Abstract

We prove unconditional security for a quantum key distribution (QKD)
protocol based on distilling pbits (twisted ebits) [1] from an arbitrary
untrusted state that is claimed to contain distillable key. Our main
result is that security can be verified using only public communication,
via parameter estimation of the given untrusted state. The technique
applies even to bound entangled states, thus extending QKD to the regime
where the available quantum channel has zero quantum capacity. We also
show how to convert our purification-based QKD schemes to
prepare-measure schemes.



I. Background, problem, and result 

A large class of Quantum Key Distribution (QKD) pro- 
tocols are based on entanglement-purification-protocols 
(EPP). We use the shorthand EPP-QKD for these pro- 
tocols. It is known that a secure key can be obtained by 
locally measuring two systems prepared in some maximally 
entangled state (also known as EPR pairs [2] or ebits). The
security and working principle of EPP-QKD are based on 
the ability of two separated parties to estimate error rates 
of an untrusted shared state relative to ebits and to subsequently
distill ebits. In [1], it was found that the most general quantum state
(known to the users) which provides a secure key (after measurement)
need not be an ebit. It is called a pbit or "twisted ebit." These pbits
can likewise be distilled or purified from known shared states. This paper
is focused on the scenario when the users share untrusted 
states, and how to devise QKD schemes under such cir- 
cumstances based on pbit-purification-protocols. We call 
these protocols PPP-QKD. The main goal is to devise an 
analogous error estimation scheme relative to pbits, using
only public classical communication. The scheme applies 
to some "bound entangled" initial states that are nonethe- 
less sufficiently close to pbits. (A state is bound entan- 
gled if no ebits can be distilled from many copies of it.) 
Consequently, there are channels that cannot be used to 
send quantum information (zero quantum capacity), but 
that can be used for QKD (nonzero key capacity). Furthermore, in a
spirit similar to [3], we provide a recipe for converting PPP-QKD
protocols to their associated prepare-measure schemes (P/M-QKD). We
will concentrate on the verification scheme of Lo, Chau, and Ardehali
[4], [5], where bit and phase error rates are estimated. We instead
estimate "twisted" bit and phase error rates. Our proof uses classical
random sampling theory and the exponential quantum de Finetti theorem [6].

We first provide a pedagogical review of the essential concepts of QKD
in Sec. I-A. Readers familiar with QKD can skip the review. We then
discuss the current problem in Sec. I-B, followed by a precise statement
of our results in Sec. I-D as well as related results in Sec. I-E. The
proof of security is contained in Sec. II, with the essence of it being
in Sec. II-D. We follow this up with a discussion on how to convert our
protocol to a P/M-QKD scheme in Sec. III and give an example of QKD
using a binding-entanglement channel in Sec. IV, where the error rate is
so high that the quantum capacity vanishes. An interesting observation,
that the users need not know which private state they share, and how to
exploit this fact are given in Sec. VI. We have built our protocols
piece by piece, and a summary of the complete protocols is given in
Sec. V. We conclude with other remarks and a discussion of open problems
in Sec. VII. Proofs are detailed in the Appendix, and the theorems are
restated in the body of the paper.

A. Review of quantum key distribution (QKD) 

In the quantum world, it is generally impossible to extract 
information about a quantum state without disturbing it 
[7]. This principle enables unconditionally secure key distribution,
which is impossible classically. Key distribution is
the task of establishing a key between two parties, Alice 
and Bob. Informally, a key distribution protocol is secure 
if the probability to establish a compromised key vanishes. 
(In the above statement, we have allowed the key length to 
vary and when it is zero, the protocol "aborts". See also
[8].) If a protocol (given some stated resources) is secure 
against the most powerful adversary (Eve) limited only by 
laws of physics, its security is "unconditional." 

In QKD, Alice and Bob can use a quantum channel (from 
Alice to Bob) and classical channels (in both directions). 
These can be noisy and controlled by Eve. In addition, Al- 
ice and Bob have local coins and in some cases, quantum 
computers. These can be noisy but they are not controlled 
by Eve. Finally, Alice and Bob share a small initial key. Using the
quantum universal composability result [9], most of the imperfect
resources can be made near-perfect while preserving security: the
classical channels and local resources
can be made reliable and authenticated using the initial 
key and coding. We make these simplifying assumptions 
from now on, and focus on imperfections in the quantum 
channel. (As a side remark, for arbitrary adversarial im- 
perfections in the quantum channel, no coding method can 
convert it to a perfect quantum channel. Fortunately, QKD 
requires less (see above) and this paper revolves around the 
minimal requirement on the quantum channel.) 

We first give the intuition behind the security offered by 
quantum mechanics assuming a noiseless quantum channel. 
Alice and Bob pre-agree on a set of non-orthogonal quantum states, each
of which may be transmitted by Alice through the quantum channel with
some probability. (This is the case in the earliest QKD scheme, called
BB84 [10].) Eve can intercept and compromise the quantum signals, but
they will be disturbed. Bob tells Alice when he receives the states,
and they subsequently detect disturbance using some of the 
states, and if they observe none, they extract a key from 
the rest of the states; otherwise, they abort the protocol. 

QKD based on noisy channels is important for two reasons. 
First, natural noise is inevitable and can be used by an 
eavesdropper as a disguise. Second, it is desirable to be 
able to generate a key despite some malicious attack. Initial
work [11], [12] was done based on error estimation, privacy
amplification [13], and error correction. Mayers first gave
an unconditional security proof for QKD [14], showing that
BB84 can provide a key up to $\approx 8\%$ observed error.

Later on, Lo and Chau [3] reported a security proof for a different QKD
scheme based on E91 [15]. Alice and Bob first share some noisy,
untrusted state $\rho_?$. (We tag the state with a "question mark" to
emphasize that the users cannot ascertain its identity.) It is supposed
to be $n$ copies of
$|\Phi_d\rangle := \frac{1}{\sqrt{d}} \sum_{i=1}^{d} |i\rangle_A |i\rangle_B$,
where $\{|i\rangle\}$ is a computational basis for the local systems $A$
and $B$ possessed by Alice and Bob respectively. $|\Phi_d\rangle$ is
called a "maximally entangled state" or MES for short. When $d = 2$ it
is also called an EPR pair or "ebit". $\rho_?$ arises from Alice
preparing $n$ local copies of $|\Phi_d\rangle$ and transmitting Bob's
shares through an untrusted channel of $d$ dimensions. Eve can attack
all $n$ systems jointly. For now, we focus on the $d = 2$ case (just
like [1]). After Bob receives the state, Alice and Bob extract a smaller
number $m$ of nearly perfect ebits, from which a key is obtained by
measuring in the local computational basis. It is possible that $m = 0$,
in which case QKD is aborted. The Lo-Chau proof is simple: however the
noise arises, just detect and remove it, and doing so only involves
standard techniques in entanglement purification protocols (EPP) or
distillation [16].
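
As a small numerical illustration of the definition of $|\Phi_d\rangle$ (a sketch only, not part of the original analysis), the following NumPy snippet builds the state and tabulates the joint distribution of Alice's and Bob's computational-basis outcomes. The function names are illustrative assumptions.

```python
import numpy as np

def maximally_entangled_state(d):
    """Return |Phi_d> = (1/sqrt(d)) * sum_i |i>_A |i>_B as a length d*d vector."""
    phi = np.zeros(d * d)
    for i in range(d):
        phi[i * d + i] = 1.0  # amplitude on |i>_A |i>_B
    return phi / np.sqrt(d)

def joint_outcome_distribution(state, d):
    """Probabilities p(a, b) when A and B are measured in the computational basis."""
    return np.abs(state.reshape(d, d)) ** 2

d = 2  # the ebit (EPR pair) case
probs = joint_outcome_distribution(maximally_entangled_state(d), d)
print(probs)  # all weight on the diagonal: outcomes agree and are uniform
```

For a perfect ebit the two outcomes always agree and each value occurs with probability 1/2, which is what makes the measured bit usable as key.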

The disadvantage of the Lo-Chau proof is that its associated scheme
requires quantum storage and coherent manipulation of quantum data,
neither of which is required
in BB84. Shor and Preskill [3] provided a recipe to relate 
BB84 to the E91-Lo-Chau scheme, such that the security 
of the former is implied by that of the latter. Furthermore, 
[3] generalizes to many other variants of BB84 (collectively
called "prepare-measure" schemes, P/M-QKD) so that their
security can be proved via that of a related purification- 
based QKD scheme. 

B. Step-by-step QKD and motivation of current problem 

We discuss useful general concepts by interpreting the Lo-Chau scheme
[1] as follows. Alice and Bob preagree on a set of parameters $e$ for
states, and let sets of states sharing the same parameters be labeled as
$S_e$. (It will be clear later how they should be chosen.) The protocol
is a 4-step process for Alice and Bob:

(1) Distribute an untrusted bipartite state $\rho_?$ using the
untrusted resources.

(2) Perform tests (via public discussion) on $\rho_?$ such that if
$\rho_? \in S_e$, the test will output $e$ with high probability.
They only need to know which $e$ (or $S_e$) but not which $\rho_?$,
and the remaining procedure depends only on $e$ and applies to all
states in $S_e$. For example, in the Lo-Chau scheme, $e$ consists of two
error rates (bit and phase), and $S_e$ is the set of states arising from
inflicting errors of rates $e$ on the Bell states (see Def. 1 and
Eq. (3) towards the end of Sec. II-A for a precise definition).

(3) Based on the parameter $e$, apply an appropriate EPP to $\rho_?$
and output a state $\gamma_?$.

This procedure, if applied to any state in $S_e$, will return a state
$\gamma_?$ which is a good approximation of a known and trusted state
$\gamma$ (e.g. ebits in E91/Lo-Chau).

(4) Generate a key by measuring $\gamma_?$ locally.

The key can have varying size (depending on $e$), and zero
key-length means "abort QKD."

We will refer to these 4 steps (and their variations) repeat- 
edly throughout the paper. 

To generalize the Lo-Chau scheme, we examine the require- 
ments for each of these steps (in reverse order). 

We start with step (4): Simply suppose Alice and Bob share a known and
trusted state $\gamma$. What $\gamma$ (other than $|\Phi_d\rangle$) will
generate a secure key? Reference [1] characterizes all such $\gamma$ (up
to local unitaries on $A$ and $B$):

$$\gamma_d = U \, (\Phi_{d\,AB} \otimes \rho_{A'B'}) \, U^\dagger \qquad (1)$$

$$U = \sum_{ij} |ij\rangle\langle ij|_{AB} \otimes U_{ij\,A'B'} \qquad (2)$$

where $\Phi_d = |\Phi_d\rangle\langle\Phi_d|$ is the MES of local
dimension $d$, the subscripts $AA'$ and $BB'$ denote systems held by
Alice and Bob, the $U_{ij}$ are unitary so that $U$ is also unitary, and
$\rho_{A'B'}$ is any state (pure or mixed) of some arbitrary dimension
$d'$. (Note that $\dim(\gamma_d) = d^2 d'$ and the key generated has
size $\log d$.) $U$ in Eq. (2) is called the twisting operator, and any
$\gamma_d$ given by Eq. (1) is called a pdit (or private state, or
twisted state, or gamma state). In some sense, Eq. (1) and Eq. (2)
characterize all the noise on an MES that is harmless for the purpose of
generating a key. We state this property of the twisting operator $U$
more precisely.

Observation 1: (See [1], [17].) Let $U$ be any twisting operator, and
consider any two states $\rho_{AA'BB'}$, $\sigma_{AA'BB'}$ related by
$\sigma_{AA'BB'} = U \rho_{AA'BB'} U^\dagger$. Let both be purified by
$E$. (See the definition in Sec. II-B, right before Eq. (4).) Then, if
Alice and Bob measure $A$ and $B$ in the computational basis, the
reduced states on $ABE$, $\rho_{ABE}$ and $\sigma_{ABE}$, are the same.

Such postmeasurement states are called ccq states, for Al- 
ice and Bob hold classical systems, while Eve's state re- 
mains unmeasured and quantum. 
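
Observation 1 can be checked numerically on a small instance. The sketch below is only an illustration under stated assumptions: the dimensions, the random state, and all names (`d_anc`, `ccq_state`, and so on) are hypothetical choices, not taken from the paper. It builds a block-diagonal twisting operator as in Eq. (2), applies it to a random pure state on $ABA'B'E$, and compares the ccq states obtained before and after twisting.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unitary(dim):
    """Unitary obtained from the QR decomposition of a random complex matrix."""
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(z)
    return q

# Illustrative dimensions (assumptions): the key part AB has d*d levels,
# the ancillary part A'B' has d_anc levels in total, Eve has d_eve levels.
d, d_anc, d_eve = 2, 4, 3
d_key = d * d

# Twisting operator of Eq. (2): one unitary U_ij on A'B' per basis state |ij> of AB.
U = np.zeros((d_key * d_anc, d_key * d_anc), dtype=complex)
for ij in range(d_key):
    block = slice(ij * d_anc, (ij + 1) * d_anc)
    U[block, block] = random_unitary(d_anc)

# Random pure state on (AB)(A'B')E, stored as psi[key, ancilla, eve].
psi = rng.normal(size=(d_key, d_anc, d_eve)) + 1j * rng.normal(size=(d_key, d_anc, d_eve))
psi /= np.linalg.norm(psi)

# Apply U on ABA'B' and the identity on E.
twisted = (U @ psi.reshape(d_key * d_anc, d_eve)).reshape(d_key, d_anc, d_eve)

def ccq_state(state):
    """Measure AB in the computational basis and trace out A'B'.

    Returns rho[k], the unnormalized state of E given key outcome k = (i, j).
    """
    return np.einsum('kse,ksf->kef', state, state.conj())

same_ccq = np.allclose(ccq_state(psi), ccq_state(twisted))
print(same_ccq)  # True: twisting leaves the ccq state unchanged
```

The global state does change under $U$; only the information available after the key measurement (Alice's and Bob's classical outcomes together with Eve's system) is unaffected.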

Now consider step (3): Which states can be converted into a good
approximation of a private state ($\gamma$)? We call such states "key
distillable" (even though they are not of tensor power form). The
conversion procedure has to work for all states in $S_e$. A complete
characterization is unlikely to be tractable and we only have examples.
The canonical example in [4] is the set of states with sufficiently low
error rates relative to perfect ebits. We will call these
"e-good-ebits". These are not necessarily tensor power or product states
(see Def. 1 and Eq. (3)). Here, EPP works for all e-good-ebits
independent of which one is the initial state. Another example is tensor
power states $\sigma^{\otimes n}$. In this case, one says that a
protocol achieves a "key rate" $r$ if it converts $\sigma^{\otimes n}$
to $\approx \gamma_{2^{rn}}$ for some $U$ given by Eq. (2), allowing $n$
to be asymptotically large. For example, protocols and lower bounds for
$r$ are found in [18] for general $\sigma$.

We mention some surprising facts about private states and 
key-distillable states. All perfect pdits ($\gamma$) contain some
distillable entanglement. However, there are families of pdits with a
vanishing amount of distillable entanglement that can nevertheless be
used to provide a key at a constant rate. Also, there are states that
are close to pdits and have distillable key (lower bound from [18]) but
have no distillable entanglement (upper bound from showing the
positivity of the partial transpose (PPT) [19], [20]).
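
The PPT test mentioned above is simple to state in code. The following sketch is a generic illustration and does not construct any of the bound entangled key-distillable states discussed in the text. It computes the partial transpose of a bipartite density matrix and checks whether it is positive semidefinite. The helper names and the tolerance are assumptions.

```python
import numpy as np

def partial_transpose(rho, dim_a, dim_b):
    """Transpose subsystem B of a density matrix on A (x) B."""
    r = rho.reshape(dim_a, dim_b, dim_a, dim_b)   # indices: a, b, a', b'
    return r.transpose(0, 3, 2, 1).reshape(dim_a * dim_b, dim_a * dim_b)

def is_ppt(rho, dim_a, dim_b, tol=1e-12):
    """True if the partial transpose has no eigenvalue below -tol."""
    eigenvalues = np.linalg.eigvalsh(partial_transpose(rho, dim_a, dim_b))
    return bool(eigenvalues.min() >= -tol)

phi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
ebit = np.outer(phi, phi)          # maximally entangled two-qubit state
maximally_mixed = np.eye(4) / 4    # separable reference state

print(is_ppt(ebit, 2, 2))             # False: the ebit has a negative partial transpose
print(is_ppt(maximally_mixed, 2, 2))  # True
```

A state that passes this test has no distillable entanglement, which is the sense in which PPT serves as an upper bound in the passage above.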

Switching from trusted states to untrusted states, we now move on to the
main concern of this paper: In step (2), what sets $S_e$ contain key
distillable states and admit parameter estimation? What are the
corresponding tests for finding if $\rho_? \in S_e$? In the Lo-Chau
proof, $\gamma$ are ebits and $S_e$ can be chosen to be e-good-ebits
(these are states with bounded error relative to ebits, see Def. 1 and
Eq. (3)). In the most general case, $\gamma$ is a pdit and a natural
question is, can all key-distillable $S_e$ be tested?

We believe that the above question is hard, by considering 
all possible "e-good-pbits" - states obtained from applying 
any twisting operation to e-good-ebits, where the twisting 
can act jointly on the entire system. Without further re- 
striction on the joint twisting operation, it is unclear how 
to perform parameter estimation on the joint state. 

One particularly useful class of e-good-pbits are those obtained from
applying a tensor-power twisting $U^{\otimes n}$ to e-good-ebits. We will
call these states $\otimes$-twisted-e-good-pbits (note that just like
e-good-ebits, $\otimes$-twisted-e-good-pbits need not be tensor power
states). (See also Def. 2.)

Prior to this work, Ref. [21] showed how to perform parameter estimation
for $S_e$ containing $\otimes$-twisted-e-good-pbits that have some
distillable entanglement. There, the important distinction from the
Lo-Chau scheme is that, in Ref. [21], entanglement is only distilled for
parameter estimation but not for the subsequent key generation. In
particular, the entanglement distilled in the scheme of Ref. [21] can be
in negligible quantity compared to the key size. But [21] leaves many
questions unanswered, in particular, whether $S_e$ can contain bound
entangled but key distillable states, and whether distilling
entanglement (albeit a little) is necessary. Also, the test in [21]
prevents easy conversion to a simpler class of schemes called
prepare-measure schemes (P/M-QKD, see below).

In this paper, we will show by an explicit protocol that parameter
estimation is possible for all $S_e$ containing states which can be
converted into $\otimes$-twisted-e-good-pbits by LOCC operations
(involving only local operations and public classical communication).
Furthermore, this new estimation procedure does not involve
distillation, so it applies to bound entangled states; it only involves
product observables (see Def. 3), allowing easy conversion to P/M-QKD,
as we will see later.

C. Our adversarial setting 

Throughout the paper, we are concerned with unconditional security of
QKD, in which nothing is assumed about the actual channel used or about
the actual shared state $\rho_?$.
There are three separate notions that we want to mention 
explicitly. 

• Alice and Bob use an underlying quantum resource "a" 
(a channel or quantum state) in order to execute QKD. 
They have some knowledge about this resource, for exam- 
ple, the natural channel loss due to their distance can be 
theoretically calculated. 

• During any specific execution, this resource is subject to
further unknown attack to produce the actual channel or
state $\rho_?$, "b". This will remain unknown to Alice and Bob
throughout.

• What is known to Alice and Bob in an execution is a set
of observed error rates "c".

The insecurity of QKD can be quantified by the probability that the
state has been compromised more than the observed error rates have
suggested. Security is a consequence of the test procedure to obtain
"c", and is independent of any of the above.

It is a combination of the QKD protocol and the observed error "c" that
determines the actual key rate, and this depends on the actual channel
or state $\rho_?$ "b", which in turn cannot be better than the
underlying resource "a". This is why analyses of a QKD protocol often
refer to the underlying resource "a": the protocol, starting with
resource "a" and subject to further unknown attack, will result in some
potentially worse observed error rates that may still give the green
light to establish a secure key. $S_e$ can be used to describe both
concepts "a" and "c".

To repeat, given resource "a" that is too noisy, QKD gives
zero key-rate whether there is eavesdropping or not. On 
the other hand, no matter how good "a" is, too much 
eavesdropping should also give a zero key-rate. So, QKD 
is only interesting given good enough underlying quantum 
resource "a" together with a scheme to ensure security. 
The goal of this paper can be understood as characterizing 
what underlying resources "a" are good enough under our 
scheme. 

D. Statement of results 

In this paper, we report a new test procedure in step (2) for any $S_e$
containing $\otimes$-twisted-e-good-pbits. Since Alice and Bob can use
LOCC in QKD, our procedure also applies to any $S_e$ containing states
which can be converted into $\otimes$-twisted-e-good-pbits by LOCC. In
particular, these include (1) $\otimes$-twisted-e-good-pbits themselves,
and (2) tensor power key distillable states. This new method does not
require distilling entanglement, and it applies regardless of whether
the states in $S_e$ have distillable entanglement or are bound entangled.

The protocol in this paper is similar to that in [21], and is also a
"twist" of the original Lo-Chau scheme. In the critical step of phase
error estimation, we test for "twisted phase errors" (phase errors in
the basis defined by the twisting operation) just as in [21]. In [21],
the test is based on entanglement distillation and teleportation. Here,
our new procedure uses a more recently found finite quantum de Finetti
theorem with exponential convergence [6] and requires only local
resources, measurement of product observables (Def. 3), and classical
communication. This has significant consequences:

(1) There are quantum channels that have zero quantum 
capacity but nonzero key capacity. Each set of states $S_e$ captures
what Alice and Bob expect $\rho_?$ to be. It summarizes deviations from
perfect pbits including channel noise and noise inflicted by
eavesdropping. For example, Alice and Bob can have prior knowledge of
the presumably available quantum channel (resource "a" in the previous
subsection), which is susceptible to further attack by an eavesdropper.
Our work extends QKD to the regime when this presum- 
ably available quantum channel cannot transmit quantum 
data, and only allows sharing of a bound entangled key 
distillable state at best (without an eavesdropper). With 
the unknown eavesdropping attack, the final key rate de- 
pends on the combined noise level, and can potentially be 
positive. (In the static case, there are states that are un- 
trusted and presumably bound entangled that can still give 
a secure key.) 

(2) Prepare-measure scheme based on private states. Remarkably, in the
noiseless case, E91 is mathematically related to many "prepare-measure"
QKD schemes (P/M-QKD) including BB84. P/M-QKD only requires quantum
states to be prepared and sent by Alice, and measured by Bob without
being stored, and thus requires minimal coherent quantum manipulation.
P/M-QKD has a large practical advantage over distillation or
purification based schemes, but the latter often have simple
unconditional security proofs. Shor and Preskill [3] illustrated
mathematical connections between the two types of schemes even in the
noisy case for some EPP. Starting from the Lo-Chau security proof, they
rederived one for BB84 similar to Mayers'. Reference [22] generalized
the connection to more general EPP. Likewise, our new test procedure
allows the purification-based scheme to be transformed to a P/M-QKD
scheme. This can be useful in implementation.

We note a side result that may be of independent interest: average
values of an observable in a bulk system can be estimated from a
sublinear sample even when the observable cannot be directly measured.
This will be discussed more in Sec. VII.

E. Related work 

As already noted, this paper is a follow-up of [21] on parameter
estimation of untrusted states relative to pbits. The scheme in [21]
requires a small amount of distillable entanglement; it does not apply
to bound entangled states and thus cannot be used on states generated by
a channel with zero quantum capacity.

An earlier version of the current result (unpublished) used 
an exact but polynomial quantum de Finetti theorem [23]
from which we obtained a much lower key rate. The new 
exponential quantum de Finetti theorem (exp-QDFT) in 
[6] provides much better bounds and properties. 

There are two intuitive solutions to the current problem of parameter
estimation. The first is a state-tomographic estimation, which was
suggested in [1], but the accuracy and security were not analyzed. It is
interesting to note here that exp-QDFT provides exactly the tool for
doing so. Whatever $\rho_?$ is, Alice and Bob can simply choose half of
the systems (or any linear amount) at random, and the chosen state
$\rho'$ is exponentially close to a mixture of "almost-power-states."
$S_e$ can be chosen to be tensor powers of key distillable states and
the test for $\rho' \in S_e$ simply involves state tomography using only
measurement of product observables and classical communication. (During
the final preparation of the manuscript, we heard of some work in
progress using this approach [24].) This paper follows another intuitive
approach: error estimation in the twisted basis, via a decomposition of
the twisted observable into product observables (see Def. 3).
Intriguingly, a natural choice of the set of product observables is also
tomographically complete. However, discarding is not necessary here. The
main challenge is a rigorous security proof, along with a careful
analysis of how various parameters are related. We have used many
different elements (including the exp-QDFT) in [6], along with earlier
techniques such as quantum-classical reduction and various random
sampling techniques [4], [5], and also ideas from [21].

In this paper, we have also emphasized various useful concepts, such as
"harmless errors" and the structural constituents of QKD. Examples of
harmless errors (most generally defined by the private states) were
observed in earlier works by Aschauer and Briegel [25] and were used in
[26], [27], [28] to improve the key rate. Various useful structural
descriptions of QKD, revolving more around P/M-QKD, have also been
proposed before [5], [29], [6].

After the initial presentation of this result [30], and during the
preparation of the current manuscript, Renes and Smith [31] reported the
following related result. The P/M-QKD scheme [32] that uses local noise
inflicted by Alice to increase the key rate has an interpretation as a
QKD scheme based on distributing and distilling a partic- 
ular private/twisted state. Thus they arrived at an (ex- 
isting) example of P/M-QKD based on private states (but 
the state has to be (ebit) distillable since the noise is lo- 
cal). This is complementary to our current result (item (2)) 
that aims at a general recipe to convert distillation-based 
schemes to P/M-QKD. 

Finally, [33] contains a summary of this paper without the 
technical details. 

II. Details of our result 

Recall that in the current formulation of QKD, the goal is to accurately
test whether the shared bipartite state $\rho_?$ is in some set $S_e$ or
not, and if so, apply a transformation that will bring any state in that
$S_e$ to a state close to a private state $\gamma$. The test and
transformation use only LOCC. Note that $\rho_?$ is determined by
eavesdropping and the channel properties, while $S_e$ and the test are
part of the design of the QKD scheme. We will describe and prove the
security of a QKD scheme with $S_e$ containing
$\otimes$-twisted-e-good-pbits (see Def. 2).

As described before, our procedure also applies to any $S_e$ containing
states which can be converted into $\otimes$-twisted-e-good-pbits by
LOCC, by prepending such a transformation to our scheme. As an example,
$S_e$ may contain tensor powers of key distillable states
$\sigma^{\otimes n}$ for arbitrarily large $n$. Since $\sigma$ is key
distillable, $\exists k$ such that $\sigma^{\otimes k}$ can be
preprocessed by $\mathcal{L}_k$ (via LOCC) to a state $\tilde{\sigma}$
that approximates some private state to some predetermined accuracy (the
dimension of the key part is then $2^{kr}$ for some $r > 0$). To test if
$\rho_? \in S_e$ (i.e., whether $\rho_? = \sigma^{\otimes n}$), Alice
and Bob can first apply the preprocessing
$\mathcal{L}_k^{\otimes \lfloor n/k \rfloor}$ to $\rho_?$, followed by
our estimation procedure for $S_e$ containing
$\tilde{\sigma}^{\otimes \lfloor n/k \rfloor}$, which is a
$\otimes$-twisted-e-good-pbit. Clearly if $\rho_? \in S_e$, the above
test will pass with high probability. There are several subtle points
concerning this reduction: (1) The key rate can be suboptimal. (2) The
preprocessing may prevent the QKD scheme from being easily converted to
P/M-QKD schemes. (3) The dimension of the new key part, $2^{kr}$, is
finite but can be large for finite preprocessing precision, and the
accuracy of our test has a strong dimensional dependence.

We will also return to one other observation in Sec. VI: a given state
can be related to many different pbits (defined by different twisting
operations). Consequently, the error rate of a state relative to each
pbit, and thus the key rate, depend on the choice of the pbit being
considered, and should be optimized. For now, we consider an arbitrary
choice, such as one arising from the knowledge of the available channel.
Later, we will describe a simple method for the optimization in Sec. VI.

Both [21] and this paper exploit the relation between e-good-ebits and
e-good-pbits: they differ only by a change of basis (in particular, for
$\otimes$-twisted-e-good-pbits, the change is simply given by the tensor
power of the single-system twisting operation Eq. (2)). That the
twisting is not an LOCC operation, of course, changes all the nonlocal
resource accounting. But surprisingly, as we will see, a variation of
the Lo-Chau scheme is invariant under twisting, except for one step. So,
we detail how and why the Lo-Chau scheme works, and explain how that
exceptional step can be circumvented.

A. Concepts in tolerable attacks 

Core to the analysis of QKD using noisy resources is a 
notion of tolerable adversarial attacks, which are quantified 
by the parameters to be estimated. (For example, these are 
chosen to be the number of bit-flip and phase-flip errors in 
the transmitted qubits in many schemes.) We make this 
notion precise in the following, and develop notations used 
throughout the paper. Consider an $n$-qubit system. Let $\mathcal{E}$ be
the Pauli group acting on it (parameter $n$ omitted). For each
$P \in \mathcal{E}$, up to a scalar factor in $\{\pm 1, \pm i\}$,
$P = \sigma_x^{x_1}\sigma_z^{z_1} \otimes \sigma_x^{x_2}\sigma_z^{z_2} \otimes \cdots \otimes \sigma_x^{x_n}\sigma_z^{z_n}$,
where $\sigma_x, \sigma_z$ are the generators of the qubit Pauli group,
and $x_i, z_i \in \{0, 1\}$ are matrix exponents. It will become clear
that the scalar factor is irrelevant in our work, thus each $P$ is
represented by the two $n$-bit strings $x = (x_1, x_2, \ldots, x_n)$ and
$z = (z_1, z_2, \ldots, z_n)$, which we will call the "X- and
Z-components" of $P$. The number of 1's in a bitstring is called its
Hamming weight. Let $e = (e_x, e_z)$. They will represent two error
rates critical in the security of QKD. Collect all $P$'s in
$\mathcal{E}$ that have X- and Z-components with Hamming weights no
greater than $n e_x$ and $n e_z$ into a set $\mathcal{E}_e$, and denote
the linear span of $\mathcal{E}_e$ (over $\mathbb{C}$) by
$S\mathcal{E}_e$. The eavesdropping attack of current interest,
described as a trace-preserving completely-positive (TCP) map, is of the
form

$$\mathcal{P}_e(\rho) = \sum_k E_k \rho E_k^\dagger \qquad (3)$$

where $E_k \in S\mathcal{E}_e$ for all $k$ and where the usual
trace-preserving condition $\sum_k E_k^\dagger E_k = I$ holds. Note that
$e_x, e_z \le 1$, and when equality holds, $S\mathcal{E}_e$ is the set
of all bounded operators; thus, any eavesdropping attack is of the form
Eq. (3) for sufficiently large $e_x, e_z$.
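
The bitstring bookkeeping above can be made concrete as follows. This is a minimal sketch for small $n$ only: the function names are illustrative assumptions, and the scalar phase of $P$ is ignored, as in the text. It builds the matrix of a Pauli operator from its X- and Z-components and tests whether it lies in the set of operators with Hamming weights at most $n e_x$ and $n e_z$.

```python
import numpy as np

SIGMA_X = np.array([[0, 1], [1, 0]], dtype=complex)
SIGMA_Z = np.array([[1, 0], [0, -1]], dtype=complex)

def pauli_from_bits(x, z):
    """Matrix of sigma_x^{x_1} sigma_z^{z_1} (x) ... (x) sigma_x^{x_n} sigma_z^{z_n}."""
    op = np.eye(1, dtype=complex)
    for xi, zi in zip(x, z):
        single = np.linalg.matrix_power(SIGMA_X, xi) @ np.linalg.matrix_power(SIGMA_Z, zi)
        op = np.kron(op, single)
    return op

def hamming_weight(bits):
    """Number of 1's in a bitstring given as a sequence of 0/1 integers."""
    return sum(bits)

def in_tolerable_set(x, z, e_x, e_z):
    """True if the X- and Z-components have weights at most n*e_x and n*e_z."""
    n = len(x)
    return hamming_weight(x) <= n * e_x and hamming_weight(z) <= n * e_z

x, z = (1, 0, 0, 1), (0, 0, 1, 0)   # n = 4, X-weight 2, Z-weight 1
P = pauli_from_bits(x, z)
print(P.shape)                                # (16, 16)
print(in_tolerable_set(x, z, 0.5, 0.25))      # True: 2 <= 2 and 1 <= 1
print(in_tolerable_set(x, z, 0.25, 0.25))     # False: X-weight 2 exceeds 1
```

Only the pair of bitstrings is needed to decide membership; the explicit matrix is built here just to show what the representation stands for.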

For the case of qubit transmission, we omit the $d = 2$ in the notation
for the maximally entangled state, writing $|\Phi\rangle$ and $\Phi$.
Using Eq. (3), we make the important definition:

Definition 1 (e-good-ebit) We call the state
$\mathcal{P}_e(\Phi^{\otimes n})$ "$n$ e-good-ebits", where
$\mathcal{P}_e$ acts on the $n$ qubits of Bob and the identity map acts
on the $n$ qubits of Alice.

Note that e-good-ebits are not necessarily tensor power states. We now
define the analogue in a twisted basis:

Definition 2 ($\otimes$-twisted-e-good-pbit) We call the state
$U^{\otimes n} [\mathcal{P}_e(\Phi^{\otimes n}) \otimes \rho_{\rm anc}] U^{\dagger \otimes n}$
"$n$ $\otimes$-twisted-e-good-pbits", where $U$ is a twisting operator
given by Eq. (2), and the ancillary state $\rho_{\rm anc}$ can be
arbitrary over all the ancillary systems $(A'B')^{\otimes n}$.

B. The Lo-Chau scheme 

The Lo-Chau scheme focuses on the $d = 2$ case. Alice uses the channel
$n$ times to send Bob's halves of $n$ ebits she prepared locally. In the
absence of Eve, the state $\rho_?$ shared after step (1) should differ
from $\Phi^{\otimes n}$ by the channel noise. It thus makes sense to use
$e = (e_x, e_z)$ as the parameter $e$ in $S_e$, and define $S_e$ to be
the set of all $n$ e-good-ebits. Here, $e_x, e_z$ are called the bit and
phase error rates respectively. When eavesdropping is possible, Alice
and Bob need to determine $e$ for which $\rho_? \in S_e$ with high
probability. In security proofs, we do not lose security if we assume
less. So, we let $\rho_?$ be completely unconstrained and allow Eve to
possess the purification of $\rho_?$. (A purification of a mixed state
$\rho$ on a system $s_1$ is a pure state on two systems $s_1, s_2$ such
that tracing out the extra system $s_2$ will give $\rho$. The purifying
system $s_2$ contains all information related to $s_1$ outside of it.)
Since $\{P|\Phi\rangle^{\otimes n}_{AB}\}_{P \in \mathcal{E}}$ is a
basis for $\mathcal{H}_{(AB)^{\otimes n}}$, the purification of
$\rho_?$, $|\Psi_1\rangle$, has the form

$$|\Psi_1\rangle = \sum_{P \in \mathcal{E}} a_P \, (P|\Phi\rangle^{\otimes n}_{AB}) \otimes |e_P\rangle_E . \qquad (4)$$

Here, $P$ ranges over all possible $n$-qubit Pauli operators in
$\mathcal{E}$ and acts on Bob's $n$ qubits, $a_P$ are arbitrary
amplitudes, and $|e_P\rangle_E$ are normalized states on system $E$.
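
The basis property used for Eq. (4) can be verified directly in the single-pair case $n = 1$, from which the general case follows by taking tensor products. The snippet below is an illustrative check with assumed variable names: it applies each of the four Pauli operators $\sigma_x^{x}\sigma_z^{z}$ to Bob's half of $|\Phi\rangle$ and confirms that the resulting vectors are orthonormal.

```python
import numpy as np
from itertools import product

SIGMA_X = np.array([[0, 1], [1, 0]], dtype=complex)
SIGMA_Z = np.array([[1, 0], [0, -1]], dtype=complex)
IDENTITY = np.eye(2, dtype=complex)

# |Phi> = (|00> + |11>) / sqrt(2), ordered as A (x) B.
phi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)

states = []
for x_bit, z_bit in product((0, 1), repeat=2):
    pauli = np.linalg.matrix_power(SIGMA_X, x_bit) @ np.linalg.matrix_power(SIGMA_Z, z_bit)
    states.append(np.kron(IDENTITY, pauli) @ phi)   # P acts on Bob's qubit only

# Gram matrix of inner products <u|v>; the identity means an orthonormal basis.
gram = np.array([[np.vdot(u, v) for v in states] for u in states])
is_orthonormal_basis = np.allclose(gram, np.eye(4))
print(is_orthonormal_basis)  # True
```

These four vectors are the Bell states, labeled by whether a bit error ($x = 1$) and a phase error ($z = 1$) occurred.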

Step (2) in the Lo-Chau scheme is carried out by estimating the error
rates $e_x, e_z$ by random sampling of the AB-systems without
replacement. To estimate $e_x$, $m$ systems are chosen randomly and
$\sigma_z \otimes \sigma_z$ is measured on each of them, and the
estimated $e_x$ is the number of $-1$ outcomes divided by $m$. (Note
that the outcome of measuring $\sigma_z \otimes \sigma_z$ on an ebit
should be $+(-)1$ when there is zero (one) $\sigma_x$ error.) In other
words, one measures $\lambda$, the eigenvalue of
$\sum_{i=1}^{m} (\sigma_z \otimes \sigma_z)^{(i)}_{AB}$ where $(i)$
denotes the $i$th sampled system, and estimates $e_x$ to be
$\frac{1}{2}(1 - \frac{\lambda}{m})$. (Similarly for $e_z$.)
We will next describe the estimation process in two ways,
a simple abstraction and the actual implementation, and
we show that they are equivalent.
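
The estimator just described amounts to a few lines of arithmetic. In the sketch below, the list `parities` is a hypothetical record of the $\pm 1$ outcomes that $\sigma_z \otimes \sigma_z$ would give on each of the $n$ systems; it is an assumption made for illustration, since in the protocol only the sampled systems are actually measured.

```python
import random

def estimate_bit_error_rate(parities, m, seed=None):
    """Estimate e_x from m systems sampled without replacement.

    parities: +1/-1 outcomes of sigma_z (x) sigma_z, one per system.
    Returns (1 - lambda/m) / 2, where lambda is the sum of sampled outcomes.
    """
    rng = random.Random(seed)
    sample = rng.sample(parities, m)      # sampling without replacement
    lam = sum(sample)                     # eigenvalue of the summed observable
    return 0.5 * (1 - lam / m)

# Example: n = 100 systems, 10 of which carry a bit error (outcome -1).
parities = [1] * 90 + [-1] * 10
estimate = estimate_bit_error_rate(parities, m=40, seed=7)
print(estimate)
```

Since $\lambda = m - 2k$ when $k$ of the sampled outcomes are $-1$, the returned value equals $k/m$, the fraction of $-1$ outcomes in the sample.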

In the abstract, the error estimation transforms the state to

$$\sum_P a_P \, (P|\Phi\rangle^{\otimes n}_{AB}) \sum_e \mu_{P,e} \, |e\rangle_O \otimes |e_P\rangle_E \qquad (5)$$

where the experimental estimate of the error rates of the QKD execution,
$|e\rangle = |e_x, e_z\rangle$, is in the system $O$ available to all
three parties. In a good estimation procedure, the estimated error rates
should not deviate significantly from the actual values, except with
very small probability. Let $n e_{xP}, n e_{zP}$ be the Hamming weights
of the X- and Z-components of $P$. A good estimation translates to the
mathematical statement that, for each $e_{xP}, e_{zP}$, the sum of
$|\mu_{P,e}|^2$ over $e$ should be small whenever $|e_{xP} - e_x|$ or
$|e_{zP} - e_z|$ is significant. Reference [5] provides a test procedure
for the Lo-Chau scheme based on random sampling that achieves the
following: For small $\delta$ and for
m < ( jT§p )n, we have 

$$\Pr(|\epsilon_{xP} - \epsilon_x| > \delta) < f(m,\delta) \qquad (6)$$

where $f(m,\delta) := 2\exp(-2m\delta^2)$ with natural exp rather
than base 2, so that $|\mu_{P,\epsilon_x,\epsilon_z}|^2 < 2f(m,\delta)$ if
$\max(|\epsilon_x - \epsilon_{xP}|, |\epsilon_z - \epsilon_{zP}|) > \delta$. In Eq. (6), the probability is over
the random sample taken for the error estimate. (See [34]
for a derivation of Eq. (6) from [5].) Achieving good estimation
is a central aspect of QKD. The proof in [5] is
subtle: measurement of $P$ commutes with measurement
of $\epsilon$, so that whether the former is done cannot change the
distribution of the latter. So, we can assume that measurement
of $P$ has been done here. Most importantly, this assumption
applies even to actual indirect measurements of $\epsilon$ that
may not commute with measurement of $P$, as long as the
indirect measurement gives accurate results, and all intermediate
results (except for the final outcome) are discarded
(see argument to follow). This imagined measurement of $P$
turns both $P$ and $\epsilon$ into classical random variables so that
classical random sampling theory can be applied.
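
As a rough numerical illustration of Eq. (6), the following sketch evaluates the bound $f(m,\delta) = 2\exp(-2m\delta^2)$ and the smallest sample size that pushes it below a target failure probability. The function names `f_bound` and `min_samples` are hypothetical helpers introduced here; they are not taken from [5] or [34].

```python
import math

def f_bound(m: int, delta: float) -> float:
    """f(m, delta) = 2 exp(-2 m delta^2), using the natural exponential."""
    return 2.0 * math.exp(-2.0 * m * delta ** 2)

def min_samples(delta: float, target: float) -> int:
    """Smallest m with f(m, delta) <= target (hypothetical helper)."""
    return math.ceil(math.log(2.0 / target) / (2.0 * delta ** 2))

if __name__ == "__main__":
    m = min_samples(delta=0.05, target=1e-6)
    print(m, f_bound(m, 0.05))
```

The sketch only evaluates the stated bound; it does not check the additional condition on $m$ relative to $n$ required in the text.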

In real experiments, there are two differences from the
abstraction. First, the intended measurement operators
$\sigma_x \otimes \sigma_x$ and $\sigma_z \otimes \sigma_z$ are nonlocal (these are parity measurements
in the conjugate and computational bases) but
they are implemented via local measurements. For example,
the eigenvalue of $\sigma_x \otimes \sigma_x$ is obtained by measuring that of
$\sigma_x \otimes I$ and $I \otimes \sigma_x$ on the properly paired AB systems and
classically taking the product of the two outcomes ($\pm 1$).
Second, the $2m$ random samples will be irreversibly measured
out. We want to replace the analysis of the real experiments
by that of the abstraction, and we now show that
such replacement is valid if we impose certain conditions
on the protocol, as detailed in the following observation:

Observation 2: If Alice and Bob perform (1) local (demo- 
lition) measurements, (2) classical communication of the 
outcomes, (3) classical postprocessing and output a func- 
tion of the outcomes, (4) discard all measured systems, all 
intermediate outcomes and communicated messages, then 
the procedure is equivalent to a direct measurement yield- 
ing the function (and nothing else) and discarding the mea- 
sured system. 

Proof: Local measurements can be made "coherently" so
that the outcome is stored in the computational basis
of an ancilla without being read. Classical communication
from Alice to Bob can be modeled as the isometry
$|x\rangle_A \to |x\rangle_A |x\rangle_B |x\rangle_E$ where Eve's copy ensures classicality
and generality of the security argument. Similarly for classical
communication from Bob to Alice. Then, Alice and
Bob each perform the classical postprocessing (locally) to
derive the same intended measurement outcome. Besides
this, they discard everything else, i.e. they give to Eve
all measured systems, their copies of the coherent classical
communication, and the workspace of the classical postprocessing,
and Eve can reconstruct the postmeasurement
state. The entire procedure is thus equivalent to the desired
direct measurement.

Keeping in mind not to use the $2m$ samples again, we can
analyze the state in Eq. (5) in the abstract setting. This
state can be rewritten as

$$\sum_P \alpha_P {\sum_\epsilon}' \mu_{P,\epsilon} \, (P|\Phi\rangle^{\otimes n})_{AB} \, |\epsilon\rangle_O \otimes |e_P\rangle_E + |\Psi_{\rm bad}\rangle \qquad (7)$$

where the primed sum over $\epsilon$ is now restricted to those terms
in which $\epsilon_x$, $\epsilon_z$ are $\delta$-close to $\epsilon_{xP}$, $\epsilon_{zP}$ respectively, and the
unnormalized $|\Psi_{\rm bad}\rangle$ contains all other terms with bad estimates.
The important point is that $|\Psi_{\rm bad}\rangle$ has norm
squared bounded by $2f(m,\delta)$. (To see this, label the sum
over those $\epsilon$ by a double prime, and
$\sum_P {\sum_\epsilon}'' |\alpha_P \mu_{P,\epsilon}|^2 = \sum_P |\alpha_P|^2 {\sum_\epsilon}'' |\mu_{P,\epsilon}|^2 \le \sum_P |\alpha_P|^2 \, 2f(m,\delta) \le 2f(m,\delta)$.)
We include this bad term in our equations to keep track
of the entire picture but we need not worry about its evolution.

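In step (3), based on the estimates $\epsilon_x, \epsilon_z$, Alice and Bob run
any applicable EPP (e.g. see [16], [22]) on the unmeasured
$n - 2m$ systems. In the abstract, the state becomes

$$\sum_P \alpha_P {\sum_\epsilon}' \mu_{P,\epsilon} \left( \beta_g \, |\Phi\rangle^{\otimes (n-2m) r_\epsilon} |g_\epsilon\rangle + \beta_b \, |b_\epsilon\rangle \right) |\epsilon\rangle_O |e_P\rangle_E + |\Psi_{\rm bad}\rangle$$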

To obtain the above expression, note that the output of
EPP depends on $P$, $\epsilon$, and random inputs of EPP. Taking
a coherent description for the local coins, and focusing on
one $(P, \epsilon)$-term in the primed sum where the error estimate is
accurate, EPP produces an output with high fidelity with
respect to $|\Phi\rangle^{\otimes (n-2m) r_\epsilon}$, where $r_\epsilon$ is the entanglement rate
depending mostly on $\epsilon$ (and slightly on $n - 2m$ for finite-size
effects, and finally, negligibly on the local coins because this
effect can be removed by lowering the rate slightly). We
collect the rest of the system into a sufficiently large auxiliary
space. Uhlmann's theorem [35] guarantees an output
state for the $(P,\epsilon)$-term in the form inside the parenthesis:
the state $|\Phi\rangle^{\otimes (n-2m) r_\epsilon} |g_\epsilon\rangle$, with auxiliary output state $|g_\epsilon\rangle$, and the bad
EPP term $|b_\epsilon\rangle$ are orthonormal, with $\beta_{g,b} \ge 0$, $\beta_g^2 + \beta_b^2 = 1$
and $\beta_b$ upper bounded by a function exponentially decaying
with $n$ [16], [3], [22], [36]. If $\epsilon_x, \epsilon_z$ are too high, $r_\epsilon = 0$
and implicitly QKD is aborted (yet preserving security).
The incoherence between the different $P$, $\epsilon$ terms can be
absorbed into the auxiliary system. In real experiments,
EPP is done incoherently, but as long as Alice and Bob refrain
from using anything other than the final output (i.e.
discarding everything else) the abstract picture will hold.

Finally, Alice and Bob measure out a key from the
$(AB)^{\otimes (n-2m) r_\epsilon}$ systems, which have high fidelity to ebits
(when conditioning on other systems is NOT made). This
guarantees security [8] in the universal composable definition
[9], [8]. In particular, let the ideal state be
$|\psi_{\rm ideal}\rangle = \sum_{P,\epsilon} \alpha_P \mu_{P,\epsilon} \, |\Phi\rangle^{\otimes (n-2m) r_\epsilon} |g_\epsilon\rangle |\epsilon\rangle_O \otimes |e_P\rangle_E$, and the
output in the last equation be $|\psi_{\rm actual}\rangle$. Then, the
QKD (in)security parameter in [5] is upper bounded
by $\sqrt{1 - |\langle\psi_{\rm ideal}|\psi_{\rm actual}\rangle|^2} \le \sqrt{4f(m,\delta) + \beta_b^2}$ (because
$|\langle\psi_{\rm ideal}|\psi_{\rm actual}\rangle| \ge (1 - 2f(m,\delta))\beta_g$). Roughly speaking,
it means that if an ideal key used in any application is
replaced by the one generated in the QKD protocol, no
attack involving all parts of the application can achieve a
statistical difference better than the stated insecurity parameter.
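
The step from the overlap bound to the insecurity parameter can be checked numerically. The sketch below compares $\sqrt{1 - ((1-2f)\beta_g)^2}$ with the simpler bound $\sqrt{4f + \beta_b^2}$, using $\beta_g^2 + \beta_b^2 = 1$. The function names are hypothetical and the values of $f$ and $\beta_b$ are arbitrary test inputs, not figures from the paper.

```python
import math

def insecurity_from_overlap(f: float, beta_b: float) -> float:
    """sqrt(1 - overlap^2) with overlap = (1 - 2f) * beta_g and beta_g^2 + beta_b^2 = 1."""
    beta_g = math.sqrt(1.0 - beta_b ** 2)
    overlap = (1.0 - 2.0 * f) * beta_g
    return math.sqrt(max(0.0, 1.0 - overlap ** 2))

def insecurity_bound(f: float, beta_b: float) -> float:
    """The simplified upper bound sqrt(4 f + beta_b^2)."""
    return math.sqrt(4.0 * f + beta_b ** 2)

if __name__ == "__main__":
    for f in (1e-9, 1e-4, 1e-2):
        print(f, insecurity_from_overlap(f, 1e-3), insecurity_bound(f, 1e-3))
```

Expanding $1 - (1-2f)^2(1-\beta_b^2) = 4f - 4f^2 + \beta_b^2 (1-2f)^2$ shows why the simplified bound holds for $0 \le f \le 1/2$.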

Note that if better EPP protocols are found and used
in QKD (more rapidly vanishing $\beta_b$), the above analysis
implies corresponding improvement in the key rate
and security of the resulting QKD. As a concrete example,
[22] presents a scheme that achieves a key rate of
$1 - H(\epsilon_x) - H(\epsilon_z)$ where $H$ is the binary entropy function
and $\epsilon_x, \epsilon_z$ are the observed error rates, and is nonzero
for $\epsilon_x < 1/2$ if $\epsilon_z \approx 0$ (and vice versa) or $\epsilon_x = \epsilon_z < 0.11$. It
follows from Subsection II-C that underlying states or channels
with less error have the potential to establish a secure
key.
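
The quoted thresholds can be checked with a few lines of code. This is a minimal sketch of the rate expression $1 - H(\epsilon_x) - H(\epsilon_z)$ only; the function names are hypothetical, and finite-size effects and sampling overhead are ignored.

```python
import math

def binary_entropy(x: float) -> float:
    """H(x) = -x log2 x - (1 - x) log2 (1 - x), with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

def asymptotic_rate(eps_x: float, eps_z: float) -> float:
    """Rate 1 - H(eps_x) - H(eps_z); a negative value means no key from this formula."""
    return 1.0 - binary_entropy(eps_x) - binary_entropy(eps_z)

if __name__ == "__main__":
    print(asymptotic_rate(0.11, 0.11))   # slightly positive
    print(asymptotic_rate(0.49, 0.0))    # positive although eps_x is close to 1/2
```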

We end this section with a definition for a useful concept 
we came across: 

Definition 3 (Product vs nonproduct observables) A product
observable (with respect to systems $S_1$, $S_2$) is one of
the form $O_{S_1} \otimes O_{S_2}$. While nonlocal, it can be measured
using LOCC: perform the individual local measurements
$O_{S_1} \otimes I_{S_2}$, $I_{S_1} \otimes O_{S_2}$, exchange the classical outcomes, and
calculate the product.
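
To make Definition 3 concrete, the sketch below compares the expectation of a product observable with the average of the product of local $\pm 1$ outcomes on a single ebit. It is a toy illustration in plain Python on two qubits with real-valued Pauli observables; all names (`kron`, `locc_expectation`, and so on) are hypothetical helpers.

```python
import math

def kron(a, b):
    """Kronecker product of square matrices stored as nested lists."""
    m = len(b)
    size = len(a) * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]

def trace_of_product(a, b):
    """tr(a b) for square matrices of equal size."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SZ = [[1, 0], [0, -1]]

def projector(obs, sign):
    """Projector (I + sign * obs) / 2 onto the `sign` eigenspace of a +-1-valued observable."""
    return [[(I2[i][j] + sign * obs[i][j]) / 2 for j in range(2)] for i in range(2)]

def direct_expectation(rho, obs_a, obs_b):
    """tr[rho (obs_a x obs_b)]: the nonlocal product observable measured directly."""
    return trace_of_product(rho, kron(obs_a, obs_b))

def locc_expectation(rho, obs_a, obs_b):
    """Average of a*b when Alice and Bob measure locally and multiply their outcomes."""
    total = 0.0
    for a in (+1, -1):
        for b in (+1, -1):
            prob = trace_of_product(rho, kron(projector(obs_a, a), projector(obs_b, b)))
            total += a * b * prob
    return total

# Density matrix of the ebit (|00> + |11>)/sqrt(2).
AMP = 1.0 / math.sqrt(2.0)
PHI = [AMP, 0.0, 0.0, AMP]
RHO_EBIT = [[x * y for y in PHI] for x in PHI]

if __name__ == "__main__":
    print(direct_expectation(RHO_EBIT, SZ, SZ), locc_expectation(RHO_EBIT, SZ, SZ))
```

The local procedure reproduces the average, but it also reveals the individual outcomes, which is why the text requires them to be discarded.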

C. Replacing EPP by EC/PA 

In [3], [22], classes of entanglement purification protocols 
(EPP) were found to have a very nice property when used 
in EPP-QKD. If Alice and Bob apply EPP followed by fi- 
nal measurements in the computational basis to extract a 
key, their many steps can be rearranged without chang- 
ing the security of the final key. In particular, the rear- 
ranged protocol has the computational basis measurement 
done first, generating what is called a "sifted-raw-key" (the 
adjective "sifted" is only useful later in the mapping to 
P/M-QKD). The steps of the original EPP become (classi- 
cal) error correction (EC) on the sifted-raw-key followed by 
privacy amplification (PA) to generate the final key. Such 
EPP include 1-EPP protocols corresponding to CSS codes 
(i.e. involving only parity checks entirely in the computa- 
tional basis, or entirely in the conjugate basis), and also 
2-EPP protocols that are CSS like, symmetric with respect 
to exchanging Alice and Bob, and with each step depending 
only on prior measurement outcomes in the computational 
basis. We will call such schemes EC/PA-Lo-Chau schemes,
which, from now on, are always considered in place
of the original Lo-Chau scheme.

References [3], [22], [5] provide recipes to convert EPP-QKD
to the simpler P/M-QKD. We will first describe a
pbit-distillation based QKD scheme (PPP-QKD) and provide
a security proof in the next section. In Sec. III we outline
a conversion to P/M-QKD for our PPP-QKD scheme.

D. QKD based on $\otimes$-twisted-$\epsilon$-good-pbits

We first consider the $d = 2$ case in direct correspondence
with the EC/PA-Lo-Chau scheme, again omitting the $d = 2$
notations in $\mathcal{P}$ and $|\Phi\rangle$.

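After step (1), Alice and Bob are sharing untrusted state
$\rho$, and in step (2) Alice and Bob want to test if $\exists S_\epsilon$ such
that $\rho \in S_\epsilon$, where

$$S_\epsilon = \{ U^{\otimes n} ( \mathcal{P}_\epsilon(\Phi_{AB}^{\otimes n}) \otimes \rho_{\rm anc} ) U^{\dagger \otimes n} \} \qquad (8)$$

is a set of $\otimes$-twisted-$\epsilon$-good-pbits (see Def. 5) for some $U$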
satisfying Eq. (JSJ) and some arbitrary ancillary state p anc 
on $(A'B')^{\otimes n}$. In principle, Alice and Bob only need to find
$\epsilon$, but not $U$ and $\rho_{\rm anc}$, as long as they exist. We will see that
the protocol is independent of $\rho_{\rm anc}$. For now, we assume
they make a certain guess for $U$ and we will come back to
remove this requirement in Section VI.

Consider the following unfeasible scheme: first untwist, i.e.,
apply $U^{\dagger \otimes n}$ to $\rho$, then apply the EC/PA-Lo-Chau scheme.
This is equivalent to running the EC/PA-Lo-Chau scheme
in the case of $S_\epsilon = \{\mathcal{P}_\epsilon(\Phi_{AB}^{\otimes n})\}$ and thus it is secure. The
problem is that untwisting is global and requires resources
unavailable in real-life QKD. Our strategy is to write down
(mathematically) this secure but unfeasible scheme as a
first step. Then, we explain security-preserving modifications
that make the scheme feasible using the usual resources
allowed in QKD. In short, this is possible because
only one step in the EC/PA-Lo-Chau scheme is affected by the
twisting and untwisting operations (see also [21]). The exceptional
step is the estimation of $\epsilon_z$ in the twisted basis.
In [21], it was handled by first distilling some ebits followed
by teleportation of a small number of test systems to enable
untwisting. Here, it will be handled without distilling ebits.

In detail, this secure but unfeasible protocol runs as follows:

(2) Apply untwisting $U^{\dagger \otimes n}$ to $\rho$, then estimate $\epsilon_x$ and $\epsilon_z$ on
the $(AB)^{\otimes n}$ systems (by using $m_x$ and $m_z$ random samples
respectively), and finally reapply $U^{\otimes n}$.

(3') Apply untwisting $U^{\dagger \otimes n}$, measure out a sifted-raw-key
in the $n - m_x - m_z$ systems.

(4') Perform error correction and privacy amplification on
the sifted-raw-key via 1- or 2-way public discussion.

We now explain how to transform the above protocol to
one involving only measurements of product observables
and classical communication, and in particular, without the
distillation of ebits. In step (2), only a random subset of
$m_x + m_z$ systems are measured. On the other $n - m_x - m_z$
untested systems, the untwisting and twisting cancel out
(thus can be omitted). On the tested systems, for the estimate
of $\epsilon_z$, untwisting, measuring $\frac{1}{m_z}\sum_{i=1}^{m_z} (\sigma_x \otimes \sigma_x)^{(i)}_{AB}$
and twisting (where $i$ is a label of the tested sample) is
equivalent to the measurement of $\Gamma^{\rm ideal}_{\rm av} := \frac{1}{m_z}\sum_{i=1}^{m_z} \Gamma_x^{(i)}$
where $\Gamma_x = U_{ABA'B'} (\sigma_x \otimes \sigma_x \otimes I_{A'B'}) U^\dagger_{ABA'B'}$. Here, $\Gamma_x$
is a nonproduct observable (see Def. 3) and generally, it
cannot be measured in a one-shot manner using LOCC.
However, our goal is to estimate $\epsilon_z$ by measuring the combined
$\Gamma^{\rm ideal}_{\rm av}$, which is an average of $\Gamma_x$ over many different
systems, and for this purpose, we can apply some other
LOCC measurement; the method and the accuracy will be
given in the next paragraph. For the estimate of $\epsilon_x$, the
twisted observable $\Gamma_z = U_{ABA'B'} (\sigma_z \otimes \sigma_z \otimes I) U^\dagger_{ABA'B'}$ is
simply $\sigma_z \otimes \sigma_z \otimes I_{A'B'}$ because $\sigma_z$ commutes with $U$ and
$U^\dagger$. So, the original analysis of [3] holds, and $m_x$ samples
are used for this estimate. In step (3') and (4'), the computational
basis measurement to obtain the sifted-raw-key
and the rest of the classical postprocessing all commute
with $U^{\dagger \otimes n}$. Thus, we can finish the entire QKD protocol
before the untwisting, which then clearly does nothing and
can be omitted.

LOCC estimation of $\epsilon_z$ via product observables:

The goal is to replace a measurement of $\Gamma^{\rm ideal}_{\rm av} := \frac{1}{m_z}\sum_{i=1}^{m_z} \Gamma_x^{(i)}$
by an LOCC measurement of product observables,
such that the outcomes have similar average values.
We denote the probability distribution of the outcome of
measuring $\Gamma^{\rm ideal}_{\rm av}$ by $\mu_{\rm ideal}$, and that of the LOCC measurement
of product observables by $\mu_{\rm locc}$, and their averages
by $\bar\mu_{\rm ideal}$ and $\bar\mu_{\rm locc}$. If the state being measured is fixed,
$\mu_{\rm ideal}$ and $\bar\mu_{\rm ideal}$ are fixed, but $\mu_{\rm locc}$ and $\bar\mu_{\rm locc}$ are random
variables depending on the measurement outcomes.

We now explain the LOCC measurement that generates
$\mu_{\rm locc}$. First, obtain a decomposition for the single system
observable into product observables:

$$\Gamma_x = U_{ABA'B'} (\sigma_x \otimes \sigma_x \otimes I) U^\dagger_{ABA'B'} \qquad (9)$$

$$\phantom{\Gamma_x} = \sum_{j_a, j_b = 1}^{t} s_{j_a j_b} \, O_{j_a AA'} \otimes O_{j_b BB'} \qquad (10)$$

where $\{O_j\}_{j=1}^{t}$ is a basis (trace-orthonormal) for hermitian
operators acting on $AA'$ and $BB'$, and $t = d^2 d'$. Second,
Alice and Bob divide their $m_z$ samples into $t^2$ groups. They
use each group for one pair of $(j_a, j_b)$, and they obtain
a measurement outcome denoted by ${\rm Out}_{m_z/t^2}[O_{j_a AA'} \otimes O_{j_b BB'}]$
of the observable $\frac{t^2}{m_z}\sum_{i=1}^{m_z/t^2} (O_{j_a AA'} \otimes O_{j_b BB'})^{(i)}$.
This is related to a sum of product observables, and can
be measured in LOCC as mentioned before (Alice and Bob
can individually measure $O_{j_a AA'}$ and $O_{j_b BB'}$ on the $i$th
test system, multiply their results via LOCC, sum those
products over $i = 1, \cdots, m_z/t^2$, and finally take the
average). Also, let $\sum_{j_a, j_b} s_{j_a j_b} {\rm Out}_{m_z/t^2}[O_{j_a AA'} \otimes O_{j_b BB'}]$ be
the "outcome" of the LOCC estimation of the phase error
rate, defining a distribution $\mu_{\rm locc}$.
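
The decomposition into product observables can be illustrated on a toy case. The sketch below assumes $d = 2$ with a trivial shield ($d' = 1$, so $t = 4$), takes the normalized Paulis as the trace-orthonormal local basis, and uses a controlled-phase on $AB$ as a stand-in twisting. These choices, and all function names, are assumptions made for illustration rather than the twisting used later in the paper.

```python
import math

def kron(a, b):
    """Kronecker product of square matrices stored as nested lists."""
    m = len(b)
    size = len(a) * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]

def matmul(a, b):
    """Matrix product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

PAULIS = {
    "I": [[1, 0], [0, 1]],
    "X": [[0, 1], [1, 0]],
    "Y": [[0, -1j], [1j, 0]],
    "Z": [[1, 0], [0, -1]],
}
# Trace-orthonormal Hermitian basis on one qubit: O_j = Pauli_j / sqrt(2).
BASIS = {name: [[x / math.sqrt(2) for x in row] for row in mat]
         for name, mat in PAULIS.items()}

def hs_coefficients(gamma):
    """Coefficients s[(ja, jb)] = tr[(O_ja x O_jb) gamma] of a Hermitian two-qubit operator."""
    return {(ja, jb): complex(trace(matmul(kron(oa, ob), gamma))).real
            for ja, oa in BASIS.items() for jb, ob in BASIS.items()}

def reconstruct(coeffs):
    """Rebuild sum over (ja, jb) of s[(ja, jb)] * O_ja x O_jb as a 4x4 matrix."""
    total = [[0j] * 4 for _ in range(4)]
    for (ja, jb), s in coeffs.items():
        term = kron(BASIS[ja], BASIS[jb])
        for i in range(4):
            for j in range(4):
                total[i][j] += s * term[i][j]
    return total

# Toy twisting (assumption): controlled-phase diag(1, 1, 1, -1), which is its own inverse.
CZ = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1]]
GAMMA_X = matmul(matmul(CZ, kron(PAULIS["X"], PAULIS["X"])), CZ)
COEFFS = hs_coefficients(GAMMA_X)

if __name__ == "__main__":
    print({k: v for k, v in COEFFS.items() if abs(v) > 1e-12})
```

In this toy case the twisted observable is itself a single product term, so only one pair of local observables is needed; a generic twisting yields many nonzero coefficients.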

Is $\bar\mu_{\rm locc}$ close to $\bar\mu_{\rm ideal}$ that is generated by measuring $\Gamma^{\rm ideal}_{\rm av}$
directly? It will be if the entire $m_z$ sample systems are in
a joint tensor-power state, and if $m_z/t^2$ is large enough
(because Chernoff-like bounds will hold and because of
Eq. (10)). However, in our current problem, Alice and Bob
share $\rho$ which is not a tensor-power state. Fortunately,
first, by means of random sampling, we can assume permutation
symmetry in this analysis, and second, since the
estimation involves only a small portion ($m_z$) of the entire
$n$ systems, the exponential quantum de Finetti theorem [6]
states that the measured (reduced) state is close to a mixture
of "almost-tensor-power-states" so that the Chernoff-like
bounds will hold and the estimate will thus be accurate.
The exact analysis involves many adaptations of the results
in [5]. In the appendix, we prove a more general theorem
(Theorem 3) for any observable $O$ on one copy of the bipartite
Hilbert space in place of $\Gamma_x$, for any dimensions, and
for $\bar\mu_{\rm ideal}$ generated by measuring $\frac{1}{m}\sum_{i=1}^{m} O^{(i)}$ and $\bar\mu_{\rm locc}$
generated by the above LOCC procedure. We obtain an
upper bound for $\Pr(|\bar\mu_{\rm ideal} - \bar\mu_{\rm locc}| > \delta)$ as follows.

Adapting Theorem 3, we write the symbol in the appendix
on the left hand side of the arrow, and what it should be
in the current context on the right hand side. We choose
the parameters as $d \to d^2 d'$, $t \to t^2 = d^4 d'^2$, $n + 2m \to n$,
$m \to m_z$, $\delta \to \delta/3$, $\|\Gamma_x\|_{\rm HS} = d\sqrt{d'}$ (since $\Gamma_x$ is unitary).
Then,



Pr( l^idoal - Mloccl > S) 
(n-m,)(r+l) 1 4 , 2 
< 2e~ 2^ + 2 d d Hn~™*) 



+ (t z + l) 



1)2 [ 



:!(>/ 



2e 144d'd 2 



(11) 



where the three expressions in the upper bound respectively
come from the exponential quantum de Finetti theorem,
the Chernoff bound, and random sampling theory. (In
the last term of the above, we have used a tighter bound
given directly by Proposition 1 instead of the general bound
in Theorem 3.) Also, throughout the paper, $H(\cdot)$ denotes
the binary entropy function.

Furthermore, by the sampling theory Proposition [1] in the 
Appendix, e x can also be estimated with m x samples to 

the accuracy Pr(| e x ,p — e x \ > S) < 2e~ 16 . 

Putting these altogether, $\Pr(|\epsilon_{zP} - \epsilon_z| > \delta \ \text{or} \ |\epsilon_{xP} - \epsilon_x| > \delta) < f(m_x, m_z, \delta)$ for

f(m x ,m z ,S) 

(n-m z )(r+l) \ . , 2 

+ 2e 2^ + 2 d d M™-™*) 



+ (t 2 + 1)2 

m z S 2 



36tW""0 "w+ d ' d2 



2e 144rf'd 2 



(12) 



The composable security parameter will still be less than
$\sqrt{4f(m_x, m_z, \delta) + \beta_b^2}$ as derived in the summary of the
original Lo-Chau scheme.

Now, we state parameters that will make $\sqrt{4f(m_x, m_z, \delta) + \beta_b^2}$
exponentially small in some security parameter $s$. Note
that $\beta_b$ is unaffected by our modification to the EC/PA-Lo-Chau
scheme, and we focus on the $f(m_x, m_z, \delta)$ portion.
We choose some security parameter $s$ and make each term
in Eq. (12) exponentially small in $s$ ($\approx 2^{-s}$) by the following
choices (with each item corresponding to each term in
order).

(1) Take sample size for bit-error rate to be m x = s X if. 

(2) Generally, since $m_z$ has to be small compared to $n$,
thus, $r = 4s$ and $r > d^4 d'^2 \ln n$.

(3) $m' := m_z/t^2$ should be large (at least $O(s)$), while
$r/m' \ll 1$ and $m' \gg \log m'$. In particular, say, $H(r/m') < \delta^2/(72 t^2 d^2 d')$
and $m'\delta^2/(72 t^2 d^2 d') > 2 d' d^2 \log(m'/2 + 1)$
and $m' > s \times 144 t^2 d^2 d'/\delta^2 - 2\log t$.

(4) m > > ^+11^. 

Clearly, for $s$ ranging from constant to linear in $n$, there are
corresponding choices of $r$, $m_z$, $m_x$ that will work. Roughly
speaking, $m_x > O(s/\delta^2)$ and $m_z > O(\log n)$ will be the two
asymptotic requirements when $d, d'$ are fixed. The final key
rate can be given by $(1 - (m_x + m_z)/n) \times R_{\epsilon_x, \epsilon_z}$, where the
first factor is due to the use of private states and the resulting
more complicated error estimation procedure (but
lower bounded by the above choices of $m_x$ and $m_z$), and the
second factor depends on the observed twisted error rates
(that can be much lower than that relative to ebits and
where our protocol provides an advantage) and the choice
of EPP (or EC/PA procedure), lower bounds of which are
extensively studied in QKD based on ebits.

With this analysis of the accuracy of the estimation, and 
following from earlier discussion, the security proof for the 
QKD protocol is completed. 

III. Prepare and measure scheme 

In the previous section, we have provided a security proof
of the pbit-purification-based QKD (PPP-QKD) protocols
in which the parties are processing an untrusted shared
state and are extracting a key from it. Typically, the processing
requires quantum memory, and sometimes, coherent
operations on the quantum state. As we have noted,
entanglement-distillation-based protocols (EPP-QKD) are
closely related to the much simpler P/M-QKD. We will
thus convert our PPP-QKD protocol to a P/M-QKD
scheme, adapting to pbits earlier works based on ebits [3],
[22], [5].

In PPP-QKD, the initial state is completely arbitrary. In
P/M-QKD, Alice first prepares the state, and then Eve
attacks it. Thus, the state $\rho$ is more restricted. In particular,
we focus on tensor power states prepared by Alice,
the most physically relevant case because of the simplicity
in implementation.

Since our protocol already has the distillation steps replaced
by EC/PA, there are only 2 coherent steps to modify:
(1) distribute $\rho$ via an untrusted channel and (2) estimate
$\epsilon_x, \epsilon_z$ on the sample systems and measure the rest in
the computational basis to generate the sifted-raw-key.

We now dissect these two steps. In most of the useful cases,
in step (1), Alice only needs to prepare a tensor power state
$\rho_0^{\otimes n}$ over the $n$ bipartite systems and send each of Bob's
halves via one use of the given untrusted channel $\mathcal{N}$. They
are expecting to share the state $(\mathcal{I} \otimes \mathcal{N}(\rho_0))^{\otimes n}$ while they
are actually sharing $\rho = (\mathcal{I}^{\otimes n} \otimes \mathcal{E}(\rho_0^{\otimes n}))$ for an arbitrary
joint attack $\mathcal{E}$ by Eve. For step (2), recall that it suffices
to perform measurements of product observables on individual
systems. Now, focus on such a measurement of some
$O_a \otimes O_b$ on one of these systems. Let $\{|\psi_i\rangle\}_i$ be a complete
set of eigenvectors of $O_a$. Note that Alice's measurement
on each of her halves of the state commutes with the
transmission via the channel and Bob's measurement. So,
she can measure first, before sending each of Bob's halves,
without affecting the security. It means that she sends the
state ${\rm tr}_{AA'}[(|\psi_i\rangle\langle\psi_i|_{AA'} \otimes I_{BB'}) \, \rho_{0\,AA'BB'}]$ (unnormalized)
with probability which is the trace of that state, followed
by a measurement of $O_b$ by Bob. Thus, a conversion to
P/M-QKD can be obtained, with a caveat.
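
The conversion of Alice's measurement into an ensemble of signal states can be sketched for the simplest case. The code assumes that $\rho_0$ is a single pure ebit on $AB$ (no $A'B'$ part) and that Alice's observable is $\sigma_x$; the function `bob_signal` and the constants are hypothetical names introduced for this illustration.

```python
import math

def bob_signal(psi_a, joint):
    """Project Alice's system onto |psi_a> for the pure state with amplitudes joint[a][b].

    Returns (probability, normalized state on Bob's side), i.e. the pure-state
    version of tr_A[(|psi_a><psi_a| x I) rho_0] and its trace.
    """
    dim_b = len(joint[0])
    unnorm = [sum(complex(psi_a[a]).conjugate() * joint[a][b] for a in range(len(psi_a)))
              for b in range(dim_b)]
    prob = sum(abs(x) ** 2 for x in unnorm)
    state = [x / math.sqrt(prob) for x in unnorm] if prob > 0 else unnorm
    return prob, state

R = 1.0 / math.sqrt(2.0)
EBIT = [[R, 0.0], [0.0, R]]   # amplitudes of (|00> + |11>)/sqrt(2)
PLUS = [R, R]                 # sigma_x eigenvector with eigenvalue +1
MINUS = [R, -R]               # sigma_x eigenvector with eigenvalue -1

if __name__ == "__main__":
    for name, vec in (("+", PLUS), ("-", MINUS)):
        print(name, bob_signal(vec, EBIT))
```

In this toy case Alice effectively sends $|+\rangle$ or $|-\rangle$ with probability $1/2$ each, which is the familiar prepare-measure picture.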

The problem is that in purification-based QKD, each pair
of local measurements for each system is chosen probabilistically
(from Eve's point of view) and with perfect coordination
between Alice and Bob. When converting to
P/M-QKD, various naive options fail or come with extra
requirements: (a) If Alice announces her measurement
before Bob signals receipt of the states, Eve could have
intercepted the transmitted state, performed Bob's measurement,
resent the postmeasurement state to Bob, and
completely evaded detection. (b) If Alice waits until Bob
signals receipt of the states, before announcing her basis,
and then Bob makes his measurements, he will need quantum
memory to hold his received states. (c) If Alice's bases
announcement is encrypted with a private key, it has to be
of length roughly $O(m_z \log n) = O(s\,{\rm polylog}(n)/\delta^2)$ where
$s$ is the security parameter of the QKD protocol (which
can range from constant to linear in $n$). In comparison, an
initial key is also required for the authentication of some of
the classical messages. It is an open problem what the
minimum authentication requirement is. If one authenticates
all of the bases information, the identity of the states in the
test samples for parameter estimation, and forward communication
in EC/PA, it will take $O(\log n)$ key bits. Thus,
for high security parameter requiring $m_z$ to be growing
with $n$, encryption of the bases information qualitatively
increases the amount of the initial key required.

The initial solution in BB84 was to have Bob guess
the measurement basis, and to postselect the systems with
properly matched measurement bases. The price is a lower key rate.
The method was improved on by [5] so as to preserve the
key rate asymptotically. We will adapt this technique in
our protocol.

The idea in [5] is that, even though randomness in the
measurement basis is necessary for security, only a small
fraction needs to differ from the computational basis to
have sufficient confidence in the estimates of $\epsilon_x, \epsilon_z$, something
we have already exploited in the PPP-QKD scheme
in the previous section. Here, Alice and Bob will independently
pick a large enough number $O(n^c)$ of the $n$ systems
to be measured for each $O_j$ (the orthonormal basis
for operators acting on each of the local systems $AA'$ and
$BB'$). These samples are chosen randomly and with high
probability over the choice, for each pair $(j_a, j_b)$, the observable
$O_{j_a} \otimes O_{j_b}$ would have been applied to a fraction
$O(n^{2c-2})$ of all systems, giving $O(n^{2c-1})$ random samples.
Remember the requirement $m_z > s\,{\rm polylog}(n)/\delta^2$, so that
$n^c > \sqrt{s\,n\,{\rm polylog}(n)}/\delta$ will provide sufficient overlap for
calculating $\langle O_{j_a AA'} \otimes O_{j_b BB'} \rangle$ and subsequently $\epsilon_z$. For
the protocol in the previous section, the local dimensions
are $d\sqrt{d'}$ and each of Alice and Bob have $t = d^2 d' - 1$ traceless
observables to measure locally. Thus $O(t n^c) \approx o(n)$
systems will be used for estimating $\epsilon_z$ and the rest can all
be measured in the computational basis (for estimating $\epsilon_x$
and for the (unsifted) raw-key generation); thus the key rate
of the original PPP-QKD scheme is preserved. Finally, by
the procedure to turn a measurement of Alice and $\rho_0$ into
an ensemble of signal states, the conversion to a P/M-QKD
is completed.
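
The counting argument above can be checked with a small simulation. The sketch assumes that Alice and Bob each pick about $n^c$ systems uniformly at random and independently for one fixed pair $(j_a, j_b)$, and compares the simulated overlap with the expected value $n^{2c-1}$. The function names, the seed, and the example values of $n$ and $c$ are arbitrary choices for illustration.

```python
import random

def overlap_count(n: int, c: float, seed: int = 0):
    """Alice and Bob each choose round(n**c) of n systems at random; return (k, overlap size)."""
    rng = random.Random(seed)
    k = round(n ** c)
    alice = set(rng.sample(range(n), k))
    bob = set(rng.sample(range(n), k))
    return k, len(alice & bob)

def expected_overlap(n: int, c: float) -> float:
    """Expected number of systems chosen by both parties: (n**c)**2 / n = n**(2c - 1)."""
    return n ** (2 * c - 1)

if __name__ == "__main__":
    k, overlap = overlap_count(10000, 0.75)
    print(k, overlap, expected_overlap(10000, 0.75))
```

With $c > 1/2$ the overlap grows with $n$ while the tested fraction $n^{c-1}$ vanishes, which is the reason the key rate is preserved asymptotically.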

We note that the above procedure can be suboptimal, especially
if $t$ is large. For example, if the decomposition
of the single system observable $\Gamma_x$ has low Schmidt rank
$t' \ll t^2$ in the Hilbert-Schmidt decomposition, then, effectively,
only $t'$ local observables have to be measured. Also,
data from unmatching bases can be potentially useful but
in the current scheme they are discarded for simplicity of
the analysis. These, and other optimizations, are issues for
future research.



IV. A CHANNEL WITH ZERO QUANTUM CAPACITY AND 
NONZERO KEY RATE 

Recall that there are key distillable but bound entangled 
states pQ, (T7J, [37] • Using results in the current paper, 
they can be verified and therefore can have nonzero rates 
of generating unconditionally secure key. The adversarial 
setting is totally unconditional, as described in Subsection
II-C. Based on one of these states, we construct a channel
that has zero quantum capacity and nonzero key rate.

The channel is defined as follows. According to Section
III, Alice prepares a tensor power state $\rho_0^{\otimes n}$ over $n$ bipartite
systems and sends each of Bob's halves via one
use of the given untrusted channel $\mathcal{N}$. They are expecting
to share the state $(\mathcal{I} \otimes \mathcal{N}(\rho_0))^{\otimes n}$ while they are actually
sharing $\rho = (\mathcal{I}^{\otimes n} \otimes \mathcal{E}(\rho_0^{\otimes n}))$ for an arbitrary joint
attack $\mathcal{E}$ by Eve. We choose $\mathcal{I} \otimes \mathcal{N}(\rho_0)$ to be $\rho_H$ from
[57] (the definition will be given later). This state has 3
desirable properties. (1) $\rho_H$ has a maximally mixed reduced
state on $AA'$. Thus it can indeed be written as
$\rho_H = (\mathcal{I}_{AA'} \otimes \mathcal{N}_{BB'})(\Phi_{d\,AB} \otimes \Phi_{d'\,A'B'})$ for some channel
$\mathcal{N}_{BB'}$. (2) $\rho_H$ is PPT (having positive partial transpose
[19]) and is thus bound entangled. Since $\rho_H$ is bound entangled
if and only if $\mathcal{N}$ is entanglement binding (with zero
rate to create entanglement for any unentangled input)
([38]), $\mathcal{N}$ has zero quantum capacity. (3) On the other
hand, if verified, $\rho_H$ has nonzero key rate. Correspondingly,
in the absence of eavesdropping, Alice and Bob can
use $\mathcal{N}$ to distribute copies of $\rho_H$, verify them, and generate
a key. Thus $\rho_H$ and $\mathcal{N}$ provide the example we are seeking.

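We now define the state $\rho_H$. Recall that for a pure state
$|\psi\rangle$, we use the shorthand $\psi$ for the density matrix $|\psi\rangle\langle\psi|$.
Define the four Bell states as

$$|\psi_{0,1}\rangle = \frac{1}{\sqrt{2}}(|00\rangle \pm |11\rangle) \qquad (13)$$

$$|\psi_{2,3}\rangle = \frac{1}{\sqrt{2}}(|01\rangle \pm |10\rangle) \qquad (14)$$

with the projectors given by $\psi_i$. Define also the states

$$|\chi_\pm\rangle = \frac{1}{2}\left(\sqrt{2 \pm \sqrt{2}}\,|00\rangle \pm \sqrt{2 \mp \sqrt{2}}\,|11\rangle\right) \qquad (15)$$

Then, for $\kappa$ a small parameter to be defined later, take

$$\rho_H = (1 - \kappa) \sum_i q_i \, \psi_{i\,AB} \otimes \rho^{(i)}_{A'B'} + \kappa \, \frac{I}{16} \qquad (16)$$

where $q_0 = q_1 = \frac{p}{2}$, $q_2 = q_3 = \frac{1-p}{2}$ and

$$\rho^{(0)} = \frac{1}{2}\left[|00\rangle\langle 00| + \psi_2\right], \quad \rho^{(1)} = \frac{1}{2}\left[|11\rangle\langle 11| + \psi_3\right], \quad \rho^{(2,3)} = \chi_{+,-} . \qquad (17)$$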



For $p = \frac{\sqrt{2}}{1+\sqrt{2}}$ and $\kappa = 0$, $\rho_H^{T_2} = \rho_H$ where $T_2$ denotes
the partial transpose of the second system. In particular,
$\rho_H$ is PPT. When $p \neq \frac{\sqrt{2}}{1+\sqrt{2}}$, the $(1 - \kappa)$ term need not be
PPT anymore, but choosing a small nonzero $\kappa$ will give a
corresponding neighborhood of $p$ for which $\rho_H$ will remain
PPT and thus bound entangled. Here, we claim that there
is an untwisting operation of the form given by Eq. (2)
that we can apply to $\rho_H$, so that further tracing of $A'B'$
will give us the state

$$\sigma_{AB} = (1 - \kappa)\left(p\,\psi_0 + (1 - p)\,\psi_2\right) + \kappa \, \frac{I}{4} \qquad (18)$$



Note that the transformation of the $\kappa$ term is straightforward
and also $\kappa \to 0$, so, we can focus on the $(1 - \kappa)$ term.
Let $V_1$ be any unitary that transforms the following states
as:

$$|00\rangle \to |00\rangle , \quad |11\rangle \to |11\rangle , \quad |\psi_2\rangle \to |01\rangle , \quad |\psi_3\rangle \to |10\rangle \qquad (19)$$

and $V_2$ transforms $|\chi_{+,-}\rangle$ to $|00\rangle$, $|11\rangle$ respectively.
Such unitaries exist, because they preserve orthonormality of the input
space. Then, the untwisting operation can be defined as:

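$$U_H = \left( (|00\rangle\langle 00| + |01\rangle\langle 01|)_{AB} \otimes I_{A'B'} + (|10\rangle\langle 10| + |11\rangle\langle 11|)_{AB} \otimes \sigma_{zA'} \otimes I_{B'} \right) \times \left( (|00\rangle\langle 00| + |11\rangle\langle 11|)_{AB} \otimes V_{1A'B'} + (|01\rangle\langle 01| + |10\rangle\langle 10|)_{AB} \otimes V_{2A'B'} \right) \qquad (20)$$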

This is because the right hand factor first transforms
the $\rho^{(0,1,2,3)}$ component in $\rho_H$ to $\frac{1}{2}(|00\rangle\langle 00| + |01\rangle\langle 01|)$,
$\frac{1}{2}(|11\rangle\langle 11| + |10\rangle\langle 10|)$, $|00\rangle\langle 00|$, and $|11\rangle\langle 11|$, respectively,
and the subsequent left hand factor (effectively a
controlled-$\sigma_z$ from $A'$ to $A$) turns $\psi_{1,3}$ into $\psi_{0,2}$ respectively.
This proves the assertion that the untwisted state
is given by Eq. (18). The exact key rate in an execution
will depend on the observed error rates $\epsilon_x, \epsilon_z$, but it can
potentially be close to that implied by Eq. (18). In this case,
for very small $\kappa$ and $p > 1/2$ (we have taken $p \approx 0.5858$),
certain EPP (1-way asymmetric CSS EPP) studied in [16],
[22] can distill a key from $\rho$ at a rate $1 - H(\epsilon_x) - H(\epsilon_z)$ which
is $\approx 1 - H(0.5858) \approx 0.0213$ according to Eq. (18). Thus
there exists a corresponding EC/PA procedure to generate
a key from our protocol. This completes the proof that an
untrusted channel, supposedly $\mathcal{N}$, is entanglement binding
but can have nonzero key rate.
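
The quoted numbers can be reproduced directly. The sketch assumes $p = \sqrt{2}/(1+\sqrt{2})$, which evaluates to about 0.5858, and an untwisted state with bit error rate $1 - p$ and no phase error in the limit of vanishing $\kappa$; the function and variable names are hypothetical.

```python
import math

def binary_entropy(x: float) -> float:
    """H(x) = -x log2 x - (1 - x) log2 (1 - x), with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)

P = math.sqrt(2.0) / (1.0 + math.sqrt(2.0))   # approximately 0.5858
BIT_ERROR = 1.0 - P                           # weight of psi_2 in the untwisted state
PHASE_ERROR = 0.0                             # no psi_1 or psi_3 component remains
RATE = 1.0 - binary_entropy(BIT_ERROR) - binary_entropy(PHASE_ERROR)

if __name__ == "__main__":
    print(P, RATE)
```

Since $H(1-p) = H(p)$, this is the same quantity as $1 - H(0.5858)$ in the text.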

As a side remark, the first term of $\rho_H$ represents a mixture
of two pbits with a common twisting operation (no
constraint on ancillary state) but differing by a bit-flip of
the underlying ebit. EPP on these mixtures are particularly
simple.

Let us also understand this channel $\mathcal{N}$ in more operational
terms. Recall that a channel is completely determined by
the state $\mathcal{I} \otimes \mathcal{N}(\Phi_d)$ where $d$ is the input dimension of $\mathcal{N}$.
Thus, any correct way to transform $\Phi_{AB} \otimes \Phi_{A'B'}$ to $\rho_H$
via operations on $BB'$ will be a valid description of how $\mathcal{N}$
acts. Thus consider the following sequence of operations:

(1) With probability $p/2$, measure $B'$ in the computational
basis. (a) If the outcome is $|0\rangle$, do nothing. (b) If the outcome
is $|1\rangle$, apply $\sigma_z$ to $B$.

(2) With probability $p/4$, apply $\sigma_{zB} \otimes \sigma_{yB'}$.

(3) With probability $p/4$, apply $\sigma_{xB'}$.

(4) With probability $(1 - p)$, measure $B'$ with the following
POVM


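$$M_0 = \frac{1}{2}\begin{pmatrix} \sqrt{2+\sqrt{2}} & 0 \\ 0 & \sqrt{2-\sqrt{2}} \end{pmatrix} \qquad (21)$$

$$M_1 = \frac{1}{2}\begin{pmatrix} \sqrt{2-\sqrt{2}} & 0 \\ 0 & \sqrt{2+\sqrt{2}} \end{pmatrix} \qquad (22)$$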



(a) If the outcome is "0", apply $\sigma_{xB}$. (b) If the outcome is
"1", apply $\sigma_{yB} \otimes \sigma_{zB'}$.

It is immediate that cases (1a), (1b), (2), (3), (4a), and
(4b) give post-measurement states (systems labeled by
$ABA'B'$): $\psi_0 \otimes |00\rangle\langle 00|$, $\psi_1 \otimes |11\rangle\langle 11|$, $\psi_1 \otimes \psi_3$, $\psi_0 \otimes \psi_2$,
$\psi_2 \otimes \chi_+$, and $\psi_3 \otimes \chi_-$ respectively. Also, both measurements
yield equiprobable outcomes. Thus, the probabilities
for all these cases are $p/4, p/4, p/4, p/4, (1-p)/2, (1-p)/2$.
Mixing up the states from all the cases gives exactly the
$(1 - \kappa)$ term of $\rho_H$. To incorporate the negligible $\kappa$ term, we
can take a convex combination of the above with the completely
randomizing channel on $BB'$. While the bit and
phase error rates are tricky to define, any simple attempt
will yield amusingly high numbers.
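
The measurement in step (4) can be sanity-checked numerically. The sketch assumes diagonal measurement operators with entries $\frac{1}{2}\sqrt{2 \pm \sqrt{2}}$, which is our reading of Eqs. (21) and (22) and is the form needed to produce $\chi_\pm$ from an ebit. It checks completeness, equiprobable outcomes on half of an ebit, and the resulting amplitudes on $|00\rangle$, $|11\rangle$. All names are hypothetical.

```python
import math

A_PLUS = math.sqrt(2.0 + math.sqrt(2.0)) / 2.0
A_MINUS = math.sqrt(2.0 - math.sqrt(2.0)) / 2.0

# Diagonal entries of the assumed measurement operators M_0 and M_1.
M0 = (A_PLUS, A_MINUS)
M1 = (A_MINUS, A_PLUS)

def completeness(m0, m1):
    """Diagonal of M_0^dagger M_0 + M_1^dagger M_1 for real diagonal operators."""
    return tuple(a * a + b * b for a, b in zip(m0, m1))

def outcome_probability(m):
    """Outcome probability when the operator acts on B', half of an ebit (maximally mixed)."""
    return sum(x * x for x in m) / 2.0

def post_state(m):
    """Normalized amplitudes on |00>, |11> of A'B' after applying I x M to the ebit."""
    amps = [x / math.sqrt(2.0) for x in m]
    norm = math.sqrt(sum(a * a for a in amps))
    return [a / norm for a in amps]

if __name__ == "__main__":
    print(completeness(M0, M1), outcome_probability(M0), post_state(M0))
```

Outcome "0" leaves $A'B'$ in $\chi_+$; for outcome "1" the additional $\sigma_z$ on $B'$ in step (4b) supplies the relative sign of $\chi_-$.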



Finally, we can use the recipe in Sec. III to obtain a corresponding
P/M-QKD scheme. The initial state $\rho_0$ in
this case is $\Phi_{AB} \otimes \Phi_{A'B'}$. For the estimation of $\epsilon_z$, Alice
will measure product Pauli operators on $AA'$ because
they form the desired orthonormal basis for traceless observables.
The actual states transmitted are exactly the
equiprobable ensemble of the six eigenstates of $\sigma_{x,y,z}$ on
each of $B$ and $B'$ for the $O(\sqrt{ns})$ test systems. The rest of
the systems are prepared in random states in the computational
basis. Likewise, Bob measures $O(\sqrt{ns})$ systems in
the eigenbases of $\sigma_{x,y,z}$ and the rest in the computational basis.
Note that the ensemble of states sent and the measurements
on Bob's side are exactly those of the efficient version
of the six-state protocol. Here, a completely different
interpretation via pbits, and a correspondingly different
classical postprocessing scheme yield drastically different
results (the key rate will be zero otherwise).

V. Summary of protocols 

We summarize our PPP-QKD protocol and the P/M-QKD protocol in the following. We do not repeat why they are secure, and we omit the security parameters that are covered in the earlier sections.

We refer to $n$ copies of bipartite systems $AA'BB'$ with $\dim(AA'BB') = D$. Alice and Bob preagree on some product basis of $D^2$ Hermitian operators $O_{j_a} \otimes O_{j_b}$. Let $O_1$ be diagonal in the computational basis. Throughout, random sampling of the systems is done without replacement.

A general PPP-QKD protocol will proceed as follows: 

(1) Alice and Bob share untrusted states p using their un- 
derlying quantum resources. The state is supported on n 
bipartite systems A A' BB' , with . 

(2) (a) Alice and Bob jointly pick $m_x$ systems at random (by using local coins and 1-way public discussion), independently measure the $A$ and $B$ parts in the computational basis, and combine the outcomes to obtain the observed error rate $e_x$. (b) For each $j_a, j_b$, they jointly pick $m_z/D^2$ sample systems and measure $O_{j_a}$ on $AA'$ and $O_{j_b}$ on $BB'$ independently. Then, for each candidate untwisting operation, they combine the outcomes to obtain $e_z$. They pick the lowest value. They measure the rest of the $A$ and $B$ systems in the computational basis.

(3) Based on $e_z$ and $e_x$, they apply an appropriate EC/PA procedure.

(4) They obtain a key of rate determined by $e_z$ and $e_x$.

Step (2b) is the only place that differs from the standard EC/PA Lo-Chau schemes.

A general P/M-QKD has the following form: 

(1) Alice and Bob share some underlying untrusted channel $\mathcal{N}$ acting on $BB'$ and can use it $n$ times. They agree on some $\rho_0$ supported on system $AA'BB'$.

(a) For each $j_a$, let $|\psi_l\rangle_{AA'}$ be the eigenvectors of $O_{j_a}$. Alice transmits a state $\mathrm{tr}_{AA'}[(|\psi_l\rangle\langle\psi_l|_{AA'} \otimes I_{BB'})\,\rho_{AA'BB'}]$ (renormalized) via one use of the channel (without knowing what happens to the actual transmission). This state is labeled by $j_a$ and $l$, and let the normalization be $p^{j_a}(l)$. The $(j_a, l)$ state is transmitted via $n_c\, p^{j_a}(l)$ randomly chosen uses of the channel. The rest of the channel uses are the same but always have $j_a = 1$.

(b) For each $j_b$, Bob measures $O_{j_b}$ on $n_c$ randomly chosen channel outputs. The rest are measured in $O_1$.

(2) Alice and Bob then start public discussion. They use systems transmitted based on $O_{j_a}$ and measured based on $O_{j_b}$ to calculate the average of $O_{j_a} O_{j_b}$ (the average of $l\,l'$ where Alice transmits the state labeled by $l$ and Bob's outcome is $l'$).

(a) They obtain a direct estimate of $e_x$.

(b) For each candidate untwisting operation, they obtain an estimate of $e_z$, and they pick the lowest value.

(3) EC/PA is applied. 

(4) A key is generated. 

VI. Optimal untwisting 

In our QKD protocol (both the purification-based and the
P/M variant), the key rate is determined by the estimated
$e_x, e_z$, and once these are measured, we optimize over the
EPP or the EC/PA procedure.

Consider, for each $U$, the PPP-QKD scheme again.
Given $\rho$, we want to find $e_x, e_z$ such that $\rho \in S_e = \{U^{\otimes n}(\mathcal{P}_e(\Phi_{AB}) \otimes \rho_{A'B'})U^{\dagger \otimes n}\}$, and generally such $e_x, e_z$
will depend on $U$. As long as $\rho \in S_e$ for some $U$, $e$ is a
legitimate estimate and EPP will produce a secure key of
appropriate length. (Untwisting only occurs in our interpretation of the sampled data.) To exploit this feature, note that, more precisely, $e_x$ is independent of twisting, but $e_z$ is not. Thus, for a list of possible twisting operators $U_i$, Alice and Bob should estimate each of the corresponding twisted phase error rates $e_{z,i}$ and take the minimal one to optimize the key rate extractable in EPP. At first glance,
they will need to measure r™~ for each Ui- But recall that 
each twisted phase error is derived from the decomposition given by Eq. (10) and from estimating the product observables in the decomposition. For different $U_i$, the same set of product observables is measured, and the detail on $U_i$ only enters the QKD protocol in the coefficients in the decomposition Eq. (10). Thus, the same set of product observables can be used to calculate all possible $e_{z,i}$, and the optimization over the twisting operator is an entirely classical computation problem.
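
The classical optimization can be sketched as follows. This is only an illustration under stated assumptions: we assume (hypothetically) that each candidate $U_i$ supplies a real coefficient vector over one fixed list of product observables, so that its twisted phase error estimate is the inner product of that vector with the measured averages. The names `coeffs`, `product_means`, and `best_untwisting` are ours and do not appear in the protocol description.

```python
import numpy as np

def twisted_phase_errors(coeffs, product_means):
    """Return one phase-error estimate per candidate untwisting.

    coeffs        : array of shape (num_candidates, num_observables);
                    row i holds the (assumed) decomposition coefficients for U_i.
    product_means : array of shape (num_observables,); measured averages of
                    the product observables, shared by all candidates.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    product_means = np.asarray(product_means, dtype=float)
    return coeffs @ product_means

def best_untwisting(coeffs, product_means):
    """Pick the candidate with the smallest estimated twisted phase error."""
    e_z = twisted_phase_errors(coeffs, product_means)
    i = int(np.argmin(e_z))
    return i, float(e_z[i])
```

The point of the sketch is that `product_means` is measured once; only the cheap matrix-vector product depends on the list of candidates.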

A similar analysis holds for P/M-QKD. Just as in PPP-QKD, the choice of $U_i$ only enters the protocol via the classical computation of the estimate $e_{z,i}$. Thus, Alice and Bob run the protocol as stated before, but now with an extra minimization of $e_{z,i}$ over all possible $U_i$ in their classical computation, followed by the appropriate EC/PA procedure.

VII. Discussion 

We have seen that for any channel which allows for the 
distribution of key distillable states, there exists a proto- 
col for verifying security. The protocol is related to the 
scheme of Lo and Chau, the difference being that phase 
errors become twisted phase errors, and are measured by 
decomposing this operator in terms of product observables. 
Accuracy of this procedure is due to the exponential quantum de Finetti theorem [6] and the usual Chernoff bound
and sampling theory. Security of this protocol was proven
by reduction to the Lo-Chau proof of security. We then 
converted it to a prepare and measure scheme which has 
the advantage of not requiring quantum coherent control. 
Furthermore, one can classically optimize over the twisting 
operation to minimize the corresponding twisted phase er- 
ror rate, and thus maximize the key rate. More generally, 
each EPP-QKD protocol that involves parameter estima- 
tion on a small fraction of sample systems and only com- 
putational basis measurement and classical processing of 
the data has a PPP-QKD analogue and a P/M-QKD analogue.
Paradoxically, though the heart of the security proof
relates to entanglement purification, it never needs to be 
done in the actual protocol, nor is noiseless entanglement 
needed in our scheme. In particular, our protocol can be 
based on bound entangled states or binding entanglement 
channels with zero entanglement rate or quantum capacity. 

This demonstrates conceptually that quantum key distri- 
bution is not equivalent to the ability to send quantum 
information. The "information gain implies disturbance" 
effect is strong enough to provide security even in such a noisy regime. It also means that the ability to perform near-perfect error correction on any logical space is unnecessary.

As a side result, the procedure outlined in the Appendix may be of independent interest. In particular, it follows that the average value of an observable $O$ on a large number of systems (without any underlying structure) can be estimated by measuring a sublinear sample even if $O$ has to be measured indirectly in terms of a decomposition into other observables (in our case product observables) that do not necessarily commute with $O$.

We have noted that some of the states or measurement 
results are not used in the analysis. Further research will 
exploit such data to improve on the key rate. 

It will also be interesting to study the alternative protocol 
based on state tomography discussed in Sec. II-EI and pQ 
and investigate possible advantages (such as that on the 
key rate). 

A big open question is whether all entangled states can be 
converted via LOCC to pbits, and related to this question 
is whether all binding entanglement channels can be used 
for QKD. 

Finally, our protocol is restricted to some classes of twisting 
(e.g., tensor power twisting). It will be interesting to ei- 
ther show the possibility of QKD in the case of completely 
arbitrary twisting or to obtain a no-go theorem. 



Appendix 

I. LOCC estimation of the expectation of an i.i.d. observable

A. Finite quantum de Finetti theorem and generalized 
Chernoff bound 

We say that a state $\rho_n$ on Hilbert space $\mathcal{H}^{\otimes n}$ satisfies the Chernoff bound with respect to a state $\sigma$ on $\mathcal{H}$ and a measurement $\mathcal{M}$ on $\mathcal{H}$ if (with high probability) the relative frequency distribution obtained by measuring $\mathcal{M}^{\otimes n}$ on $\rho_n$ is close to that of measuring $\mathcal{M}$ on $\sigma$. An example is $\rho_n = \sigma^{\otimes n}$. However, many other states satisfy the same property. An important class is called almost power states, which are formulated and studied in [6]. We adapt results in [6] for our own purpose in the following.
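
The simplest case, $\rho_n = \sigma^{\otimes n}$, can be illustrated numerically. The sketch below is our own illustration rather than part of the argument: it samples $n$ independent outcomes of a measurement on $\sigma$ and compares the relative frequency distribution with the Born-rule distribution. The function names are ours, and the distance used is the $\ell_1$ distance between distributions (an assumption about the norm).

```python
import numpy as np

def born_distribution(sigma, povm):
    """Outcome probabilities Tr(sigma M_w) for each POVM element M_w."""
    p = np.array([np.real(np.trace(sigma @ M)) for M in povm])
    return p / p.sum()  # guard against rounding error

def empirical_distribution(sigma, povm, n, rng):
    """Relative frequencies from n i.i.d. measurements of sigma."""
    p = born_distribution(sigma, povm)
    outcomes = rng.choice(len(povm), size=n, p=p)
    return np.bincount(outcomes, minlength=len(povm)) / n

def l1_distance(p, q):
    """l1 distance between two distributions on the same alphabet."""
    return float(np.abs(np.asarray(p) - np.asarray(q)).sum())
```

For a product state the deviation shrinks roughly like $1/\sqrt{n}$; the almost power states discussed next keep an exponential tail bound of the same flavor.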

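Theorem 1: (Finite quantum de Finetti theorem plus Chernoff bound) Consider any permutationally invariant (possibly mixed) state $\rho_{n+k}$ on Hilbert space $\mathcal{H}^{\otimes(n+k)}$. Let $\rho_n = \mathrm{Tr}_k\,\rho_{n+k}$ be the partial trace of $\rho_{n+k}$ over $k$ systems. Let $0 \le r \le n/2$. Then there exists a probability measure $\mu$ on (possibly mixed) states $\sigma$ acting on $\mathcal{H}$ and a family of states $\rho^{(\sigma)}_{n,r}$ such that

1. The state $\rho_n$ is close to a mixture of the states $\rho^{(\sigma)}_{n,r}$:

$$\left\| \rho_n - \int \rho^{(\sigma)}_{n,r}\, d\mu(\sigma) \right\|_{\mathrm{tr}} \le 2\, e^{-\frac{k(r+1)}{2(n+k)} + \frac{1}{2}\dim(\mathcal{H})^2 \ln k} \tag{23}$$

2. The states $\rho^{(\sigma)}_{n,r}$ (called almost power states) satisfy the Chernoff bound in the following sense:

$$\Pr\left( \left\| P_{\mathcal{M}}(\sigma) - Q_{\mathcal{M}}[\rho^{(\sigma)}_{n,r}] \right\| > \delta \right) \le 2^{-n\left[\frac{\delta^2}{4} - H\left(\frac{r}{n}\right)\right] + |W| \log\left(\frac{n}{2}+1\right)} =: \epsilon(\delta) \tag{24}$$

where $\mathcal{M} = \{M_w\}_{w \in W}$ is any measurement on $\mathcal{H}$, $P_{\mathcal{M}}(\sigma) = \{\mathrm{Tr}(\sigma M_w)\}_w$, $Q_{\mathcal{M}}[\rho^{(\sigma)}_{n,r}]$ is the frequency distribution obtained from measuring $\mathcal{M}^{\otimes n}$ on the state $\rho^{(\sigma)}_{n,r}$, and $|W|$ is the size of the alphabet $W$.

3. Reduced density matrices of the states $\rho^{(\sigma)}_{n,r}$ (to $n' \le n$ systems with $r \le n'/2$) satisfy the same Chernoff bound:

$$\Pr\left( \left\| P_{\mathcal{M}}(\sigma) - Q'_{\mathcal{M}}[\rho^{(\sigma)}_{n,r,n'}] \right\| > \delta \right) \le 2^{-n'\left[\frac{\delta^2}{4} - H\left(\frac{r}{n'}\right)\right] + |W| \log\left(\frac{n'}{2}+1\right)} \tag{25}$$

where $\rho^{(\sigma)}_{n,r,n'} = \mathrm{Tr}_{n-n'}\,\rho^{(\sigma)}_{n,r}$ is the resulting state after partial tracing $n - n'$ systems from $\rho^{(\sigma)}_{n,r}$, and $Q'_{\mathcal{M}}[\rho^{(\sigma)}_{n,r,n'}]$ is the frequency distribution obtained from measuring $\mathcal{M}^{\otimes n'}$ on the state $\rho^{(\sigma)}_{n,r,n'}$.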

Throughout the theorem, the probability is taken over the actual measurement outcomes that define the frequency distributions. We also use $[\cdot]$ for frequency distributions defined by measurement outcomes whenever appropriate.
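
To get a feeling for when the Chernoff-type bound of Theorem 1 is nontrivial, one can evaluate its exponent numerically. The sketch below assumes the bound has the form $2^{-n[\delta^2/4 - H(r/n)] + |W|\log(n/2+1)}$ with $H$ the binary entropy and logarithms to base 2 (our reading of Theorem 4.5.2 of [6]); the function names are ours. It returns $\log_2$ of the bound to avoid floating-point underflow.

```python
import math

def binary_entropy(x):
    """Binary entropy H(x) in bits, with H(0) = H(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def log2_epsilon(delta, n, r, alphabet_size):
    """log2 of the assumed bound 2^{-n[delta^2/4 - H(r/n)] + |W| log2(n/2 + 1)}.

    A negative return value means the bound is below 1 (nontrivial).
    """
    exponent = -n * (delta ** 2 / 4 - binary_entropy(r / n))
    return exponent + alphabet_size * math.log2(n / 2 + 1)
```

The bound is useful only when $\delta^2/4 > H(r/n)$, so $r$ must be a small fraction of $n$.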

Proof: We first collect various facts, definitions, and results 
from [6]. 



A.1 Facts and definitions

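Definition 4 (Almost power state; Def. 4.1.4 in [6]): Suppose $0 \le r \le n$. Let $\mathrm{Sym}(\mathcal{H}^{\otimes n})$ denote the symmetric subspace of pure states of Hilbert space $\mathcal{H}^{\otimes n}$. Let $|\theta\rangle \in \mathcal{H}$ be an arbitrary pure state and consider:

$$\mathcal{V}(\mathcal{H}^{\otimes n}, |\theta\rangle^{\otimes n-r}) := \{\pi(|\theta\rangle^{\otimes n-r} \otimes |\Psi_r\rangle) : \pi \in S_n,\ |\Psi_r\rangle \in \mathcal{H}^{\otimes r}\}$$

where $S_n$ is the permutation group of the $n$ systems. We define the almost power states along $|\theta\rangle$ to be the set of pure states in

$$|\theta\rangle^{[\otimes, n, r]} := \mathrm{Sym}(\mathcal{H}^{\otimes n}) \cap \mathrm{span}(\mathcal{V}(\mathcal{H}^{\otimes n}, |\theta\rangle^{\otimes n-r})) \tag{26}$$

We denote the set of mixtures of almost tensor power states along $|\theta\rangle$ as $\mathrm{conv}(|\theta\rangle^{[\otimes, n, r]})$.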
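
A concrete member of this set can be built by symmetrizing $|\theta\rangle^{\otimes n-r} \otimes |\Psi_r\rangle$ over all permutations. The sketch below is our own illustration (function names are hypothetical); it assumes the symmetrized vector is nonzero, which fails for some choices of $|\Psi_r\rangle$, e.g. antisymmetric ones. With $|\theta\rangle = |0\rangle$, $|\Psi_1\rangle = |1\rangle$, $n = 3$, $r = 1$ it yields the three-qubit W state.

```python
import itertools
import numpy as np

def symmetrized_almost_power_state(theta, psi_r, n, r):
    """Sum pi(|theta>^{(n-r)} (x) |psi_r>) over all permutations pi, normalized.

    theta : vector of length d; psi_r : vector of length d**r.
    Assumes the sum is nonzero.
    """
    d = theta.size
    base = np.array([1.0 + 0j])
    for _ in range(n - r):
        base = np.kron(base, theta)
    base = np.kron(base, psi_r).reshape([d] * n)
    total = np.zeros_like(base)
    for perm in itertools.permutations(range(n)):
        total = total + np.transpose(base, perm)  # permute the n subsystems
    vec = total.reshape(-1)
    return vec / np.linalg.norm(vec)

def is_permutation_invariant(vec, n, d, tol=1e-12):
    """Check membership in the symmetric subspace by brute force."""
    t = vec.reshape([d] * n)
    return all(np.allclose(np.transpose(t, p), t, atol=tol)
               for p in itertools.permutations(range(n)))
```

The result lies in the symmetric subspace and, by construction, in the span of the permuted vectors, so it is an almost power state along $|\theta\rangle$.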

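With the above definition, we shall prove the following lemma:

Lemma 1: If $\varrho_n \in \mathrm{conv}(|\theta\rangle^{[\otimes, n, r]})$, then $\varrho_{n-m} \in \mathrm{conv}(|\theta\rangle^{[\otimes, n-m, r]})$, where $\varrho_{n-m} = \mathrm{Tr}_m(\varrho_n)$ is the reduced density matrix after the partial trace over any $m$ out of the $n$ systems (by symmetry, without loss of generality, we take the first $m$ systems).

Proof:

Since membership in $\mathrm{conv}(|\theta\rangle^{[\otimes, n-m, r]})$ is preserved under mixing, it suffices to prove the lemma for pure $\varrho_n = |\Psi_n\rangle\langle\Psi_n|$, with $|\Psi_n\rangle \in |\theta\rangle^{[\otimes, n, r]}$.

We can pick an ensemble realizing $\varrho_{n-m}$ of our choice, and prove the lemma by showing that any element $|\Psi_{n-m}\rangle$ in that ensemble belongs to $|\theta\rangle^{[\otimes, n-m, r]}$. Our ensemble is obtained by an explicit partial trace of $|\Psi_n\rangle$ over the first $m$ subsystems along the computational basis. An element is given by

$$|\Psi_{n-m}\rangle = \langle i_1| \cdots \langle i_m| \otimes I_{n-m}\, |\Psi_n\rangle. \tag{27}$$

Now, we note two facts:

(i) $|\Psi_{n-m}\rangle \in \mathrm{Sym}(\mathcal{H}^{\otimes(n-m)})$, since $|\Psi_n\rangle \in \mathrm{Sym}(\mathcal{H}^{\otimes n})$.

(ii) $|\Psi_{n-m}\rangle \in \mathrm{span}(\mathcal{V}(\mathcal{H}^{\otimes(n-m)}, |\theta\rangle^{\otimes(n-m-r)}))$. This is because $|\Psi_n\rangle \in \mathrm{span}(\mathcal{V}(\mathcal{H}^{\otimes n}, |\theta\rangle^{\otimes(n-r)}))$, and expressing $|\Psi_n\rangle$ in terms of the spanning vectors of $\mathcal{V}(\mathcal{H}^{\otimes n}, |\theta\rangle^{\otimes(n-r)})$ and putting it into Eq. (27), we have

$$|\Psi_{n-m}\rangle = \sum_{\pi, \Psi_r} a_{\pi, \Psi_r}\, \langle i_1| \cdots \langle i_m| \otimes I_{n-m}\; \pi(|\theta\rangle^{\otimes(n-r)} \otimes |\Psi_r\rangle).$$

Elementary analysis shows that any term of the above sum is, up to permutation, of the form

$$(\langle i_1|\theta\rangle) \cdots (\langle i_p|\theta\rangle)\, |\theta\rangle^{\otimes n-r-p} \otimes \left[ (\langle i_{p+1}| \cdots \langle i_m| \otimes I_{r-(m-p)})\, |\Psi_r\rangle \right]$$

where $0 \le p \le m$, and "absorbing" $m - p$ copies of $\theta$ into the last part of the vector, we get $|\theta\rangle^{\otimes n-(r+m)} \otimes |\Psi'_r\rangle$. Thus, $|\Psi_{n-m}\rangle$ is a sum of terms of the form $\pi(|\theta\rangle^{\otimes n-(r+m)} \otimes |\Psi'_r\rangle)$ and belongs to $\mathrm{span}(\mathcal{V}(\mathcal{H}^{\otimes(n-m)}, |\theta\rangle^{\otimes n-(r+m)}))$. This proves the second fact, and also the lemma. □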



The next lemma asserts that a mixture of almost tensor 
power states behaves approximately like a mixture of ten- 
sor power states with respect to a generalized version of 
Chernoff bound. 

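Lemma 2: (Theorem 4.5.2 of [6]) Let $0 \le r \le \frac{n}{2}$, $|\theta\rangle \in \mathcal{H}$, and let $|\Psi_n\rangle$ be a vector from $|\theta\rangle^{[\otimes, n, r]}$. Let $\mathcal{M} = \{M_w\}_{w \in W}$ be a POVM on $\mathcal{H}$, $P_{\mathcal{M}}(|\theta\rangle\langle\theta|)$ be the probability distribution generated by applying the measurement to $|\theta\rangle\langle\theta|$ (i.e., $P_{\mathcal{M}}(|\theta\rangle\langle\theta|) = \{\mathrm{Tr}(|\theta\rangle\langle\theta| M_w)\}_w$), and $P_{\mathcal{M}}[|\Psi_n\rangle\langle\Psi_n|]$ be the relative frequency distribution of outcomes of $\mathcal{M}^{\otimes n}$ applied to $|\Psi_n\rangle\langle\Psi_n|$. Then,

$$\Pr\left( \left\| P_{\mathcal{M}}(|\theta\rangle\langle\theta|) - P_{\mathcal{M}}[|\Psi_n\rangle\langle\Psi_n|] \right\| > \delta \right) \le 2^{-n\left[\frac{\delta^2}{4} - H\left(\frac{r}{n}\right)\right] + |W| \log\left(\frac{n}{2}+1\right)} =: \epsilon(\delta)$$

where the probability is taken over the outcomes. Note that we have used $\epsilon(\delta)$ instead of $\delta(\epsilon)$ in [6].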

Consider the general probability $\Pr(\|P_{\mathcal{M}}(\rho) - P_{\mathcal{M}}[\varrho_n]\| < \delta)$, where $P_{\mathcal{M}}[\varrho_n]$ is a frequency distribution of outcomes of $\mathcal{M}^{\otimes n}$ applied to $\varrho_n$. The distribution $P_{\mathcal{M}}[\varrho_n]$, if treated as a functional of $\varrho_n$ on the space $\mathcal{H}^{\otimes n}$, is linear in $\varrho_n$. Following this we get immediately:

Corollary 1: Lemma 2 holds when replacing the projector $|\Psi_n\rangle\langle\Psi_n|$ (for $|\Psi_n\rangle \in |\theta\rangle^{[\otimes, n, r]}$) by $\varrho_n \in \mathrm{conv}(|\theta\rangle^{[\otimes, n, r]})$.

Apart from the generalized Chernoff-type lemmas, we also need the crucial exponential quantum finite de Finetti theorem:

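Theorem 2 (Theorem 4.3.2 of [6]): For any pure state $|\psi_{n+k}\rangle \in \mathrm{Sym}(\mathcal{H}^{\otimes n+k})$ and $0 \le r \le n$ there is a measure $d\nu(|\theta\rangle)$ on $\mathcal{H}$, and for each $|\theta\rangle \in \mathcal{H}$ there is a pure state $|\psi^{\theta}_n\rangle \in |\theta\rangle^{[\otimes, n, r]}$, such that

$$\left\| \mathrm{Tr}_k\, |\psi_{n+k}\rangle\langle\psi_{n+k}| - \int_{\mathcal{H}} |\psi^{\theta}_n\rangle\langle\psi^{\theta}_n|\, d\nu(|\theta\rangle) \right\|_{\mathrm{tr}} \le 2\, e^{-\frac{k(r+1)}{2(n+k)} + \frac{1}{2}\dim(\mathcal{H}) \ln k} \tag{28}$$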

Finally, we need the fact that any permutationally invari- 
ant state has a symmetric purification. 

Lemma 3 (Lemma 4.2.2 of [6]): Let $\rho_n$ be a permutationally invariant state on $\mathcal{H}^{\otimes n}$. Then there exists a purification of the state in $\mathrm{Sym}((\mathcal{H} \otimes \mathcal{H})^{\otimes n})$.

This concludes the list of facts and definitions needed for proving Theorem 1.

A.2 Proof of Theorem 1

Consider an arbitrary permutationally invariant state $\varrho_{n+k}$ on Hilbert space $\mathcal{H}^{\otimes(n+k)}$.

Step (1): According to Lemma 3 there is a purification $|\psi_{n+k}\rangle$ that belongs to $\mathrm{Sym}(\mathcal{H}'^{\otimes n+k})$, where $\mathcal{H}' = \tilde{\mathcal{H}} \otimes \mathcal{H}$ and $\dim(\tilde{\mathcal{H}}) = \dim(\mathcal{H})$.

Step (2): We apply Theorem 2 to $|\psi_{n+k}\rangle$ with the changes

$$\mathcal{H} \to \mathcal{H}' = \tilde{\mathcal{H}} \otimes \mathcal{H}, \qquad d \to d^2 \tag{29}$$

Step (3): After application of Theorem 2 we perform the partial trace over $\tilde{\mathcal{H}}^{\otimes n}$, the purifying space introduced in step (1). We denote this partial trace by $\widetilde{\mathrm{Tr}}$. This partial trace induces, from the measure for pure states on $\mathcal{H}'$ in step (2), a new measure $\mu(\sigma)$ on the set of all mixed states $\sigma$ acting on $\mathcal{H}$. (The probability of $\sigma$ is given by the total probability of all $|\theta\rangle$ with $\widetilde{\mathrm{Tr}}(|\theta\rangle\langle\theta|) = \sigma$.) This partial trace also produces the states $\rho^{(\sigma)}_{n,r}$, defined directly by $\rho^{(\sigma)}_{n,r} = \widetilde{\mathrm{Tr}}(|\psi^{\theta}_n\rangle\langle\psi^{\theta}_n|)$, where the existence of the pure states $|\psi^{\theta}_n\rangle$ is guaranteed by Theorem 2. Finally, we note that the partial trace does not increase the trace distance between two quantum states, so applying the partial trace to the LHS of (28) and using the notation described above we immediately get the inequality (23). This proves the first item of Theorem 1.

To prove the second item of Theorem 1, recall from the above that $\rho^{(\sigma)}_{n,r} = \widetilde{\mathrm{Tr}}(|\psi^{\theta}_n\rangle\langle\psi^{\theta}_n|)$. Since $|\psi^{\theta}_n\rangle$ is an almost power pure state, Lemma 2 applies. Further, it holds for all POVMs on $\mathcal{H}' = \tilde{\mathcal{H}} \otimes \mathcal{H}$, and in particular for incomplete POVMs acting only on $\mathcal{H}$ but not on $\tilde{\mathcal{H}}$. Thus, the conclusion of Lemma 2 holds with the change $\mathcal{M} \to \mathcal{M} \otimes I$, which gives item (2).

Finally, to prove item 3 of Theorem 1, note that the reduced density matrices $\varrho^{(\sigma)}_{n,r,n'}$ of interest can be obtained from the pure state $|\psi^{\theta}_n\rangle$ above by tracing (i) first over $n - n'$ subsystems corresponding to $\mathcal{H}'$, producing a state on $\mathcal{H}'^{\otimes n'}$, and (ii) then over $n'$ subsystems corresponding to $\tilde{\mathcal{H}}$.

Then Lemma 1 guarantees that the first partial trace produces a mixed state $\varrho_{n'}$ in $\mathrm{conv}(|\theta\rangle^{[\otimes, n', r]})$ (with underlying space $\mathcal{H}'$). Applying Corollary 1 to $\varrho_{n'}$ with $n'$ instead of $n$, it suffices to consider a pure state in $|\theta\rangle^{[\otimes, n', r]}$. Finally, Lemma 2 can be applied to this pure state with $\mathcal{M} \to \mathcal{M} \otimes I$, which concludes item 3. □



B. Two other useful results

B.1 Classical random sampling

In addition to the facts and definitions above and Theorem 1, we will need the following result on classical random sampling (or equivalently, symmetric probability distributions).

Proposition 1: (Classical sampling theory; Lemma A.4 from [23]) Let $Z$ be an $n$-tuple and $Z'$ a $k$-tuple of random variables over a set $\mathcal{Z}$, with symmetric joint probability $P_{ZZ'}$. Let $Q_{z'}$ be the relative frequency distribution of a fixed sequence $z'$ and $Q_{(z,z')}$ be the relative frequency distribution of a sequence $(z, z')$, drawn according to $P_{ZZ'}$. Then for every $\epsilon \ge 0$ we have

$$\Pr\left( \left\| Q_{(z,z')} - Q_{z'} \right\| > \epsilon \right) \le |\mathcal{Z}|\, e^{-k\epsilon^2 / 8|\mathcal{Z}|} \tag{30}$$

The result says that the relative frequency distribution obtained from a small sample is close to the one obtained from the whole system. (This lemma is similar to Eq. (6), but stronger in two respects: it applies to any dimension and has no restriction on the fraction sampled. On the other hand, Eq. (6) has better constants in the exponent.)

B.2 From probabilities to averages

Lemma 4: Consider an observable $L$ on Hilbert space $\mathcal{H}$, $\dim\mathcal{H} = d$. Let $L = \sum_{i=1}^{t} s_i L_i$, where $\{L_i\}$ is a trace orthonormal basis for operators (i.e., $\mathrm{Tr}\, L_i^{\dagger} L_j = \delta_{ij}$). Let the eigenvalues of $L_i$ be denoted by $\lambda^{(i)}_l$. Consider an arbitrary state $\rho$, and let $P^{(i)} = \{p^{(i)}_l\}$ be the probability distribution on $l$ (which eigenvalue) induced by measuring $L_i$ on $\rho$. Let $Q^{(i)} = \{q^{(i)}_l\}$ be an arbitrary family of distributions on the eigenvalues of $L_i$. We then have

$$\left| \sum_i s_i \sum_l \lambda^{(i)}_l \left( p^{(i)}_l - q^{(i)}_l \right) \right| \le \sqrt{t}\, \|L\|_{HS}\, \max_i \left\| P^{(i)} - Q^{(i)} \right\|_{\mathrm{tr}} \tag{31}$$

where $\|\cdot\|_{HS}$ is the Hilbert-Schmidt norm and $\|\cdot\|_{\mathrm{tr}}$ is the trace norm.

Proof:

$$\left| \sum_i s_i \sum_l \lambda^{(i)}_l \left( p^{(i)}_l - q^{(i)}_l \right) \right| \le \sum_i |s_i| \left( \max_l |\lambda^{(i)}_l| \right) \left\| P^{(i)} - Q^{(i)} \right\|_{\mathrm{tr}} = \sum_i |s_i|\, \|L_i\|_{\infty} \left\| P^{(i)} - Q^{(i)} \right\|_{\mathrm{tr}} \le \left( \max_i \left\| P^{(i)} - Q^{(i)} \right\|_{\mathrm{tr}} \right) \sum_j |s_j|\, \|L_j\|_{\infty} \tag{32}$$

where $\|\cdot\|_{\infty}$ is the operator norm. Since $\|L_j\|_{\infty} \le 1$, using convexity of $x^2$ we obtain

$$\sum_{j=1}^{t} |s_j|\, \|L_j\|_{\infty} \le \sum_{j=1}^{t} |s_j| \le \sqrt{t} \sqrt{\sum_j s_j^2} = \sqrt{t}\, \|L\|_{HS} \tag{33}$$

This completes the proof. □
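
The inequality of this lemma can be checked numerically on a single qubit. The sketch below is our own sanity check, not part of the proof. It assumes the trace-orthonormal basis $\{\sigma_\mu/\sqrt{2}\}$ (so $t = 4$), takes the norm on distributions to be the $\ell_1$ distance, and uses the eigenvector ordering returned by `numpy.linalg.eigh` to label outcomes; the function name is ours.

```python
import numpy as np

PAULIS = [np.eye(2), [[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
# Trace-orthonormal Hermitian basis: Tr(B_i B_j) = delta_ij
BASIS = [np.array(P, dtype=complex) / np.sqrt(2) for P in PAULIS]

def lemma4_sides(L, rho, Q_list):
    """Return (lhs, rhs) of the bound for observable L, state rho,
    and arbitrary distributions Q_list[i] on the eigenvalues of BASIS[i]."""
    lhs = 0.0
    max_dist = 0.0
    for B, q in zip(BASIS, Q_list):
        s_i = np.real(np.trace(B.conj().T @ L))        # coefficient of L on B
        lam, vecs = np.linalg.eigh(B)                  # eigenvalues, eigenvectors
        # p_l = <v_l| rho |v_l>: distribution induced by measuring B on rho
        p = np.real(np.einsum('ji,jk,ki->i', vecs.conj(), rho, vecs))
        lhs += s_i * np.dot(lam, p - np.asarray(q))
        max_dist = max(max_dist, float(np.abs(p - np.asarray(q)).sum()))
    hs_norm = np.sqrt(np.real(np.trace(L.conj().T @ L)))
    rhs = np.sqrt(len(BASIS)) * hs_norm * max_dist
    return abs(lhs), rhs
```

In practice the right-hand side is often loose; the lemma only needs it to convert closeness of distributions into closeness of averages.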

C. Estimation - detailed description 

We consider $2m + n$ systems with Hilbert space $\mathcal{H}^{\otimes(2m+n)}$, $\dim\mathcal{H} = d$, in a permutationally invariant state $\varrho_{2m+n}$. Suppose the ultimate goal is to obtain the "empirical mean-value" of some single-system observable $\Sigma$ on a sample of $n + m$ systems. In other words, we want to measure $\frac{1}{N}\sum_{j=1}^{N} \Sigma^{(j)}$, where $\Sigma^{(j)} = I \otimes I \otimes \cdots \otimes \Sigma \otimes \cdots \otimes I$ on the $N$ subsystems for $N = n + m$.

Because of experimental limitations (here, the LOCC constraints on Alice and Bob), they are restricted to measuring product operators of the form $L = L_A \otimes L_B$ by independently finding the eigenvalues of $L_A$ and $L_B$ (i.e., making the measurements $L_A \otimes I$ and $I \otimes L_B$), discussing over classical channels, and multiplying their outcomes together. Now, to measure $\Sigma$, one can first rewrite it in terms of product operators $L_i$:

$$\Sigma = \sum_{i=1}^{t} s_i L_i \tag{34}$$

where we have chosen $\{L_i\}$ to be Hermitian and trace orthonormal, so that the $s_i$ are real. The $L_i$'s are "intermediate observables." We will describe an inference scheme in which (1) only the "empirical mean-value" of $\Sigma$ on a small number ($m$) of subsystems is estimated, and (2) the measurement of $\Sigma$ is done indirectly via measurements of the $L_i$'s.

The analysis will start with a special assumption about the $2m$-element sample, $m$ of which are used for indirect estimation. The assumptions on that sample are then relaxed. After that, properties of the other $m + n$ subsystems are inferred.
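
The indirect measurement can be sketched for two qubits as follows. This is an illustration under our own assumptions: the product basis is $\{\sigma_a \otimes \sigma_b / 2\}$, each term is measured on `copies_per_term` independent copies of a fixed state `rho` (an i.i.d. simplification of the permutation-invariant setting), and Alice's and Bob's local outcomes are multiplied classically. The function name is ours.

```python
import numpy as np

PAULIS = [np.eye(2, dtype=complex),
          np.array([[0, 1], [1, 0]], dtype=complex),
          np.array([[0, -1j], [1j, 0]], dtype=complex),
          np.array([[1, 0], [0, -1]], dtype=complex)]

def indirect_estimate(Sigma, rho, copies_per_term, rng):
    """Estimate Tr(Sigma rho) from simulated local measurements only."""
    estimate = 0.0
    for A in PAULIS:
        lam_a, Va = np.linalg.eigh(A)          # Alice's local eigenbasis
        for B in PAULIS:
            lam_b, Vb = np.linalg.eigh(B)      # Bob's local eigenbasis
            L = np.kron(A, B) / 2.0            # trace-orthonormal product operator
            s = np.real(np.trace(L @ Sigma))   # coefficient of Sigma on L
            V = np.kron(Va, Vb)                # columns: product eigenvectors
            probs = np.real(np.einsum('ji,jk,ki->i', V.conj(), rho, V))
            probs = np.clip(probs, 0, None)
            probs = probs / probs.sum()
            values = np.kron(lam_a, lam_b) / 2.0   # products of local outcomes
            outcomes = rng.choice(4, size=copies_per_term, p=probs)
            estimate += s * values[outcomes].mean()
    return estimate
```

Only the products of local outcomes are used, so the estimate respects the LOCC restriction even when $\Sigma$ itself is not a product observable.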



C.1 Analysis of the $2m$ sample in an "almost power state along $\sigma$": $\varrho^{(\sigma)}_{2m,r}$

Suppose the first $2m$ subsystems are in a joint state $\varrho^{(\sigma)}_{2m,r}$ with $r \le \frac{1}{2} \times 2m$. We expect the state $\varrho^{(\sigma)}_{2m,r}$ to play a role similar to the state $\sigma^{\otimes 2m}$. Define the theoretical direct average

$$\langle\Sigma\rangle_{\sigma} = \mathrm{Tr}(\Sigma\sigma) = \sum_i s_i \langle L_i\rangle_{\sigma} \tag{35}$$

We now consider an indirect measurement of $\Sigma$ applied on the first $m$ subsystems, and a direct measurement on the next $m$ subsystems. We will show that the empirical average, either obtained directly or indirectly, will be close to the above.

For the indirect measurement, divide the first $m$ subsystems into $t$ groups. Each group has $m' = m/t$ subsystems. Alice and Bob take the $i$th group ($i = 1, \ldots, t$) and measure $L_i$ on each site as described above (the measurement is $\mathcal{L}_i$). In other words, the measurement $\mathcal{M}^{\mathrm{indirect}} = \otimes_{i=1}^{t} (\mathcal{L}_i^{\otimes m'})$ is applied to the first $m$ subsystems of the entire $2m + n$ subsystems.

We expect $\varrho^{(\sigma)}_{2m,r}$ and $\sigma^{\otimes 2m}$ to behave similarly. In particular, consider an observable $L_i = \sum_l \lambda^{(i)}_l |\psi^{(i)}_l\rangle\langle\psi^{(i)}_l|$ expressed in its spectral decomposition, and the probability distribution on the set of eigenvalues $\Lambda_i$ induced by the state $\sigma$ as follows:

$$P_i = \left\{ \mathrm{Tr}\left( \sigma\, |\psi^{(i)}_l\rangle\langle\psi^{(i)}_l| \right) \right\}_l \tag{36}$$

An execution of the measurement $\mathcal{L}_i^{\otimes m'}$ gives a particular outcome $(l_1, \ldots, l_{m'})$ and induces a relative frequency distribution $Q_i$ on $\Lambda_i$.

Then, the empirical frequency distribution $Q_i$ is close to the "theoretical" distribution $P_i$.

Fact 1:

$$\Pr\left( \|P_i - Q_i\|_{\mathrm{tr}} > \delta \right) \le \epsilon(\delta, m', r, d), \tag{37}$$

where the probability is taken over the measurement outcomes, $d$ is the dimension of the single-site Hilbert space, and

$$\epsilon(\delta, n, r, |Z|) := 2^{-\left(\frac{\delta^2}{4} - H\left(\frac{r}{n}\right)\right) n + |Z| \log\left(\frac{n}{2}+1\right)} \tag{38}$$

Proof: Follows immediately from the third item of Theorem 1. Note that we use item (3) and not (2), since we perform the measurement only on part of the state $\varrho^{(\sigma)}_{2m,r}$.

Remark: Note also that $P_i$ is constant while $Q_i$ is a random variable.

Now, we define the theoretical average values for the intermediate observables $L_i$:

$$\langle L_i\rangle_{\sigma} = \mathrm{Tr}(L_i \sigma) \tag{39}$$

and the empirical averages

$$\langle L_i\rangle_{\mathrm{emp}} = \sum_l \lambda^{(i)}_l\, Q_i(l) \tag{40}$$



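where $Q_i(l)$ denotes the value of $Q_i$ on a specific event $l$ in the alphabet $\Lambda_i$. (Again, $\langle L_i\rangle_{\sigma}$ is constant while $\langle L_i\rangle_{\mathrm{emp}}$ is a random variable depending on the particular outcomes of the measurements, and recall that $L_i = \sum_l \lambda^{(i)}_l |\psi^{(i)}_l\rangle\langle\psi^{(i)}_l|$.) Denote the empirical value of $\Sigma$ obtained indirectly via the empirical averages of the $L_i$'s by

$$\langle\Sigma\rangle^{(m),\mathrm{ind}}_{\mathrm{emp}} = \sum_i s_i \langle L_i\rangle_{\mathrm{emp}} \tag{41}$$

We now show that the indirect empirical average is close to the direct theoretical average in Eq. (35). First, applying the union bound to Fact 1 we get

$$\Pr\left( \cup_{i=1,\ldots,t} \{ \|P_i - Q_i\|_{\mathrm{tr}} > \delta \} \right) \le t \cdot \epsilon(\delta, m', r, d) \tag{42}$$

Then using Lemma 4 we obtain that

$$\Pr\left( \left| \langle\Sigma\rangle_{\sigma} - \langle\Sigma\rangle^{(m),\mathrm{ind}}_{\mathrm{emp}} \right| > \delta \right) = \Pr\left( \left| \sum_i s_i \langle L_i\rangle_{\sigma} - \sum_i s_i \langle L_i\rangle_{\mathrm{emp}} \right| > \delta \right) \le t \cdot \epsilon\left( \frac{\delta}{\sqrt{t}\,\|\Sigma\|_{HS}}, m', r, d \right). \tag{43}$$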



We emphasize once again that the probabilities are taken 
over the measurement outcomes. 

After considering the indirect measurements, suppose that someone measures directly $\mathcal{M}^{\mathrm{direct}} = \frac{1}{m}\sum_{j=m+1}^{2m} \Sigma^{(j)}$ on the second group of $m$ subsystems. Denote the empirical average outcome by $\langle\Sigma\rangle^{(m),\mathrm{dir}}_{\mathrm{emp}}$. In a way similar to the indirect case (but much easier here) we show that the empirical direct average is close to $\langle\Sigma\rangle_{\sigma}$ in Eq. (35) (by applying Lemma 4 with $t = 1$):

$$\Pr\left( \left| \langle\Sigma\rangle_{\sigma} - \langle\Sigma\rangle^{(m),\mathrm{dir}}_{\mathrm{emp}} \right| > \delta \right) \le \epsilon\left( \frac{\delta}{\|\Sigma\|_{HS}}, m, r, d \right). \tag{44}$$

From the inequalities (43), (44) we obtain

Lemma 5: For the measurements on the state $\varrho^{(\sigma)}_{2m,r}$ considered above we have:

$$\Pr\left( \left| \langle\Sigma\rangle^{(m),\mathrm{dir}}_{\mathrm{emp}} - \langle\Sigma\rangle^{(m),\mathrm{ind}}_{\mathrm{emp}} \right| > 2\delta \right) \le t \cdot \epsilon\left( \frac{\delta}{\sqrt{t}\,\|\Sigma\|_{HS}}, m', r, d \right) + \epsilon\left( \frac{\delta}{\|\Sigma\|_{HS}}, m, r, d \right) \le (t+1) \cdot \epsilon\left( \frac{\delta}{\sqrt{t}\,\|\Sigma\|_{HS}}, m', r, d \right) \tag{45}$$

where the probability is taken over measurement outcomes.

Proof: Here the triangle inequality and the union bound applied to inequalities (43), (44) suffice, together with the properties of $\epsilon(\delta, n, r, d)$. Note that the indirect and direct measurements are performed on disjoint subsystems, so that there is a probability space for the joint outcomes.



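C.2 Passing from the $\varrho^{(\sigma)}_{2m,r}$'s to their integrals and then to a close-by state

Note that both integration and the measurement of a state to produce the classical distribution of the outcomes are linear, completely positive, and trace-preserving maps. Thus, Lemma 5 still holds under the replacement $\varrho^{(\sigma)}_{2m,r} \to \int \varrho^{(\sigma)}_{2m,r}\, d\mu(\sigma)$. Furthermore, if

$$\left\| \varrho_{2m} - \int \varrho^{(\sigma)}_{2m,r}\, d\mu(\sigma) \right\|_{\mathrm{tr}} \le \epsilon, \tag{46}$$

we can use the fact that the trace distance is nonincreasing under the measurement (a TCP map) to prove the following:

Lemma 6: For a state $\varrho_{2m}$ of $2m$ systems satisfying $\left\| \varrho_{2m} - \int \varrho^{(\sigma)}_{2m,r}\, d\mu(\sigma) \right\|_{\mathrm{tr}} \le \epsilon$ we have

$$\Pr\left( \left| \langle\Sigma\rangle^{(m),\mathrm{dir}}_{\mathrm{emp}} - \langle\Sigma\rangle^{(m),\mathrm{ind}}_{\mathrm{emp}} \right| > 2\delta \right) \le \epsilon + (t+1) \cdot \epsilon\left( \frac{\delta}{\sqrt{t}\,\|\Sigma\|_{HS}}, m', r, d \right) \tag{47}$$

where the probability is evaluated over the probability distribution $\mathcal{P}'$ on outcomes of the measurement $\mathcal{L}_1^{\otimes m'} \otimes \cdots \otimes \mathcal{L}_t^{\otimes m'} \otimes \mathcal{M}^{\otimes m}$ induced by the state $\varrho_{2m}$.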



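C.3 Inferring the direct average on $n + m$ samples of a general state $\varrho_{2m+n}$ from indirect measurements on $m$ samples

Now we pass to the general permutationally invariant state $\varrho_{2m+n}$. We have the following:

Theorem 3: Consider a permutationally invariant state $\varrho_{2m+n}$ on $\mathcal{H}^{\otimes 2m+n}$ with $\dim\mathcal{H} = d$. On this state we perform the measurement $\mathcal{L}_1^{\otimes m'} \otimes \cdots \otimes \mathcal{L}_t^{\otimes m'} \otimes \mathcal{M}^{\otimes m+n}$, which induces the probability measure $\mathcal{P}''$. (Note that $\mathcal{P}'$ is simply the marginal of $\mathcal{P}''$.) Evaluating the probability over $\mathcal{P}''$, we have

$$\Pr\left( \left| \langle\Sigma\rangle^{(m),\mathrm{ind}}_{\mathrm{emp}} - \langle\Sigma\rangle^{(m+n)}_{\mathrm{emp}} \right| > 3\delta \right) \le \epsilon_1 + \epsilon_2 + \epsilon_3 \tag{48}$$

where

$$\epsilon_1 = 2\, e^{-\frac{n(r+1)}{2(2m+n)} + \frac{1}{2} d^2 \ln n}, \qquad \epsilon_2 = (t+1)\, 2^{-m'\left[\frac{\delta^2}{4t\|\Sigma\|^2_{HS}} - H\left(\frac{r}{m'}\right)\right] + d \log\left(\frac{m'}{2}+1\right)}, \qquad \epsilon_3 = d\, e^{-\frac{m\delta^2}{8d\|\Sigma\|^2_{HS}}} \tag{49}$$

Proof: The parameters $\epsilon_1, \epsilon_2, \epsilon_3$ come from the generalized quantum de Finetti theorem, the Chernoff bound, and the sampling proposition, respectively.

To start, we apply item (1) of Theorem 1 to $\varrho_{2m} = \mathrm{Tr}_n\, \varrho_{2m+n}$ to obtain $\left\| \varrho_{2m} - \int \varrho^{(\sigma)}_{2m,r}\, d\mu(\sigma) \right\|_{\mathrm{tr}} \le \epsilon$ with $\epsilon = \epsilon_1$. Then, we apply Lemma 6 to get

$$\Pr\left( \left| \langle\Sigma\rangle^{(m),\mathrm{dir}}_{\mathrm{emp}} - \langle\Sigma\rangle^{(m),\mathrm{ind}}_{\mathrm{emp}} \right| > 2\delta \right) \le \epsilon_1 + \epsilon_2 \tag{50}$$

Now we need to connect $\langle\Sigma\rangle^{(m),\mathrm{dir}}_{\mathrm{emp}}$ with $\langle\Sigma\rangle^{(m+n)}_{\mathrm{emp}}$. For this we need the fact that $\mathcal{M}$ has at most $d$ outcomes, and we need the random sampling theorem, Proposition 1, which gives $\Pr(\|Q^{m} - Q^{m+n}\|_{\mathrm{tr}} > \delta) \le d\, e^{-m\delta^2/8d}$, where $Q^{m}$ is the relative frequency distribution on outputs of $\mathcal{M}$ induced by the state $\varrho_m$ (partial trace of $\varrho_{2m+n}$ over $m + n$ systems), $Q^{m+n}$ is the relative frequency distribution induced on the outcomes of $\mathcal{M}$ by the state $\varrho_{m+n}$ (partial trace of $\varrho_{2m+n}$ over $m$ systems), and $d$ is the dimension of the elementary Hilbert space $\mathcal{H}$ (thus $\varrho_{2m}$ is defined on $\mathcal{H}^{\otimes 2m}$). Using Lemma 4 (taking $t = 1$ here for the direct measurement) we go to the averages:

$$\Pr\left( \left| \langle\Sigma\rangle^{(m),\mathrm{dir}}_{\mathrm{emp}} - \langle\Sigma\rangle^{(m+n)}_{\mathrm{emp}} \right| > \delta \right) \le \epsilon_3. \tag{51}$$

Applying the union bound to Eqs. (50) and (51) we obtain the statement of the theorem.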